Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TestLabelNames_Cancelled flakyness #8434

Closed
wants to merge 1 commit into from

Conversation

pracucci
Copy link
Collaborator

@pracucci pracucci commented Jun 20, 2024

What this PR does

TestLabelNames_Cancelled is flaky because of this. To fix it, I propose to wait until snapshotter is effectively stopped in tests.

I start from the assumption that Snapshotter.Start() won't be called more than once, otherwise it panics. But Snapshotter didn't support multiple start-stop cycles even before (because it would panic in Snapshotter.Stop().

Which issue(s) this PR fixes or relates to

Fixes #8428

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci pracucci marked this pull request as ready for review June 20, 2024 09:02
@pracucci pracucci requested a review from a team as a code owner June 20, 2024 09:02
Comment on lines 264 to +272
func (s *BucketStore) RemoveBlocksAndClose() error {
return s.removeBlocksAndClose(context.Background(), false)
}

// RemoveBlocksCloseAndWait remove all blocks from local disk, releases all resources associated with the BucketStore
// and waits until all dependencies have been stopped.
func (s *BucketStore) RemoveBlocksCloseAndWait(ctx context.Context) error {
return s.removeBlocksAndClose(ctx, true)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be having this synchronous in all cases. If the snapshotter shuts down async, then it can create a file after the tenant directory has been cleaned up. Previously RemoveBlocksAndClose has been synchronous anyways.

having said that, it's also ok to not necessarily propagate the context all the way to RemoveBlocksCloseAndWait. This will be done soon with #8389

Copy link
Contributor

@narqo narqo Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having said that, it's also ok to not necessarily propagate the context all the way to RemoveBlocksCloseAndWait.

I think we need to, actually. At least for a case when the first syncBlocks fails (or was cancelled), the Snapshotter won't start. In this case, the call to Snapshotter.Stop will block, waiting on an empty channel.

With the above at hand, I wonder if it would be simpler to replace Snapshotter.stopped channel with a WaitGroup:

func Start() {
  wg.Add(1)
  go func() {
      defer wg.Done()
   }
}

func Stop() {
  close(stop)
  wg.Wait()
}

Although, the above it a (nit), given the plans for #8389

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let's keep the current implementation. This is a nightmare to navigate

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flaky TestLabelNames_Cancelled
3 participants