Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributor: Prevent data race on worker pool #9585

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

julienduchesne
Copy link
Member

Without the mutex:

==================
WARNING: DATA RACE
Write at 0x00c00011dc10 by goroutine 243:
  runtime.recvDirect()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:388 +0x7c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Close()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/concurrency/worker.go:38 +0x3c
  github.com/grafana/mimir/pkg/distributor.New.func9()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:531 +0x18
  github.com/grafana/dskit/services.(*BasicService).main()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:210 +0x3d0
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x34

Previous read at 0x00c00011dc10 by goroutine 245:
  runtime.chansend1()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:157 +0x2c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Go()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/concurrency/worker.go:28 +0x48
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Go-fm()
      <autogenerated>:1 +0x3c
  github.com/grafana/dskit/ring.DoBatchWithOptions()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/ring/batch.go:188 +0xa64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).sendWriteRequestToIngesters()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1572 +0x194
  github.com/grafana/mimir/pkg/distributor.(*Distributor).sendWriteRequestToBackends()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1529 +0xa48
  github.com/grafana/mimir/pkg/distributor.(*Distributor).push()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1478 +0x638
  github.com/grafana/mimir/pkg/distributor.(*Distributor).push-fm()
      <autogenerated>:1 +0x4c
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushValidationMiddleware-fm.(*Distributor).prePushValidationMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1124 +0xde0
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushSortAndFilterMiddleware-fm.(*Distributor).prePushSortAndFilterMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:976 +0x294
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushRelabelMiddleware-fm.(*Distributor).prePushRelabelMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:931 +0x4e0
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushHaDedupeMiddleware-fm.(*Distributor).prePushHaDedupeMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:828 +0x2f0
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).metricsMiddleware-fm.(*Distributor).metricsMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1170 +0x48c
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).limitsMiddleware-fm.(*Distributor).limitsMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1360 +0x258
  github.com/grafana/mimir/pkg/distributor.(*Distributor).Push()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1388 +0x2d8
  github.com/grafana/mimir/pkg/distributor.TestDistributor_PushWithDoBatchWorkers_DataRace.func2()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor_test.go:344 +0xd4

Goroutine 243 (running) created at:
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x1ac
  github.com/grafana/dskit/services.(*BasicService).switchState()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:142 +0xd8
  github.com/grafana/dskit/services.(*BasicService).StartAsync()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0x84
  github.com/grafana/dskit/services.(*Manager).StartAsync()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/manager.go:87 +0xa0
  github.com/grafana/dskit/services.StartManagerAndAwaitHealthy()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/manager.go:375 +0xa8
  github.com/grafana/mimir/pkg/distributor.(*Distributor).starting()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:606 +0x50
  github.com/grafana/mimir/pkg/distributor.(*Distributor).starting-fm()
      <autogenerated>:1 +0x44
  github.com/grafana/dskit/services.(*BasicService).main()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:160 +0xac
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x34

Goroutine 245 (finished) created at:
  github.com/grafana/mimir/pkg/distributor.TestDistributor_PushWithDoBatchWorkers_DataRace()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor_test.go:341 +0x204
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x40
==================
--- FAIL: TestDistributor_PushWithDoBatchWorkers_DataRace (0.19s)
    testing.go:1399: race detected during execution of test
FAIL
FAIL    github.com/grafana/mimir/pkg/distributor        0.975s
FAIL

What this PR does

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

Without the mutex:
```
==================
WARNING: DATA RACE
Write at 0x00c00011dc10 by goroutine 243:
  runtime.recvDirect()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:388 +0x7c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Close()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/concurrency/worker.go:38 +0x3c
  github.com/grafana/mimir/pkg/distributor.New.func9()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:531 +0x18
  github.com/grafana/dskit/services.(*BasicService).main()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:210 +0x3d0
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x34

Previous read at 0x00c00011dc10 by goroutine 245:
  runtime.chansend1()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:157 +0x2c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Go()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/concurrency/worker.go:28 +0x48
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Go-fm()
      <autogenerated>:1 +0x3c
  github.com/grafana/dskit/ring.DoBatchWithOptions()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/ring/batch.go:188 +0xa64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).sendWriteRequestToIngesters()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1572 +0x194
  github.com/grafana/mimir/pkg/distributor.(*Distributor).sendWriteRequestToBackends()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1529 +0xa48
  github.com/grafana/mimir/pkg/distributor.(*Distributor).push()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1478 +0x638
  github.com/grafana/mimir/pkg/distributor.(*Distributor).push-fm()
      <autogenerated>:1 +0x4c
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushValidationMiddleware-fm.(*Distributor).prePushValidationMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1124 +0xde0
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushSortAndFilterMiddleware-fm.(*Distributor).prePushSortAndFilterMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:976 +0x294
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushRelabelMiddleware-fm.(*Distributor).prePushRelabelMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:931 +0x4e0
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).prePushHaDedupeMiddleware-fm.(*Distributor).prePushHaDedupeMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:828 +0x2f0
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).metricsMiddleware-fm.(*Distributor).metricsMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1170 +0x48c
  github.com/grafana/mimir/pkg/distributor.NextOrCleanup.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1372 +0x64
  github.com/grafana/mimir/pkg/distributor.(*Distributor).limitsMiddleware-fm.(*Distributor).limitsMiddleware.func1()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1360 +0x258
  github.com/grafana/mimir/pkg/distributor.(*Distributor).Push()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:1388 +0x2d8
  github.com/grafana/mimir/pkg/distributor.TestDistributor_PushWithDoBatchWorkers_DataRace.func2()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor_test.go:344 +0xd4

Goroutine 243 (running) created at:
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x1ac
  github.com/grafana/dskit/services.(*BasicService).switchState()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:142 +0xd8
  github.com/grafana/dskit/services.(*BasicService).StartAsync()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0x84
  github.com/grafana/dskit/services.(*Manager).StartAsync()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/manager.go:87 +0xa0
  github.com/grafana/dskit/services.StartManagerAndAwaitHealthy()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/manager.go:375 +0xa8
  github.com/grafana/mimir/pkg/distributor.(*Distributor).starting()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor.go:606 +0x50
  github.com/grafana/mimir/pkg/distributor.(*Distributor).starting-fm()
      <autogenerated>:1 +0x44
  github.com/grafana/dskit/services.(*BasicService).main()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:160 +0xac
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /Users/julienduchesne/Repos/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x34

Goroutine 245 (finished) created at:
  github.com/grafana/mimir/pkg/distributor.TestDistributor_PushWithDoBatchWorkers_DataRace()
      /Users/julienduchesne/Repos/mimir/pkg/distributor/distributor_test.go:341 +0x204
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x40
==================
--- FAIL: TestDistributor_PushWithDoBatchWorkers_DataRace (0.19s)
    testing.go:1399: race detected during execution of test
FAIL
FAIL    github.com/grafana/mimir/pkg/distributor        0.975s
FAIL
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant