-
Notifications
You must be signed in to change notification settings - Fork 453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix NPE due to race with a closing series #3056
Conversation
3f616e6
to
a9d0f17
Compare
Codecov Report
@@ Coverage Diff @@
## master #3056 +/- ##
=======================================
Coverage 72.4% 72.4%
=======================================
Files 1099 1099
Lines 101973 101973
=======================================
Hits 73917 73917
Misses 22970 22970
Partials 5086 5086
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
544763d
to
f5d97ae
Compare
src/dbnode/storage/series/series.go
Outdated
// If the series is closed, the ID will be nil, causing the ID equality | ||
// check to panic. Further, closing the series resets the internal | ||
// cached blocks, so we don't need to continue with the logic below. | ||
if s.closed || !id.Equal(s.id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need closed
? I think it might be better to do nil checks, e.g.:
if id == nil || s.id == nil || !id.Equal(s.id) {
return
}
Also, it's not clear whether we need to be worried about s
itself being nil, but in that case:
if s == nil || id == nil {
return
}
s.Lock()
defer s.Unlock()
if s.id == nil || !id.Equal(s.id) {
return
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably ok not to check s
since that's a real concerning code bug that should be caught if calling on a nil ptr, but +1 to checking if if s.closed || s.id == nil || !id.Equal(s.id)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
src/dbnode/integration/setup.go
Outdated
@@ -421,7 +422,8 @@ func NewTestSetup( | |||
blockRetrieverMgr := block.NewDatabaseBlockRetrieverManager( | |||
func(md namespace.Metadata, shardSet sharding.ShardSet) (block.DatabaseBlockRetriever, error) { | |||
retrieverOpts := fs.NewBlockRetrieverOptions(). | |||
SetBlockLeaseManager(blockLeaseManager) | |||
SetBlockLeaseManager(blockLeaseManager). | |||
SetCacheBlocksOnRetrieve(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, can we avoid setting this to true
for all tests and just opt this in for TestWiredListPanic
?
By default we set this to false for all open source builds now so it's not really indicative of the broadly deployed set of configurations is (and would rather keep the majority of the integration tests representative of that if possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than minor comments
f5d97ae
to
6d7f56b
Compare
* master: (22 commits) Remove deprecated fields (#3327) Add quotas to Permits (#3333) [aggregator] Drop messages that have a drop policy applied (#3341) Fix NPE due to race with a closing series (#3056) [coordinator] Apply auto-mapping rules if-and-only-if no drop policies are in effect (#3339) [aggregator] Add validation in AddTimedWithStagedMetadatas (#3338) [coordinator] Fix panic in Ready endpoint for admin coordinator (#3335) [instrument] Config option to emit detailed Go runtime metrics only (#3332) [aggregator] Sort heap in one go, instead of iterating one-by-one (#3331) [pool] Add support for dynamic, sync.Pool backed, object pools (#3334) Enable PANIC_ON_INVARIANT_VIOLATED for tests (#3326) [aggregator] CanLead for unflushed window takes BufferPast into account (#3328) Optimize StagedMetadatas conversion (#3330) [m3msg] Improve message scan performance (#3319) [dbnode] Add reason tag to bootstrap retries metric (#3317) [coordinator] Enable rule filtering on prom metric type (#3325) Update m3dbnode-all-config.yml (#3204) [coordinator] Include Type in RollupOp.Equal (#3322) [coordinator] Simplify iteration logic of matchRollupTarget (#3321) [coordinator] Add rollup type to remove specific dimensions (#3318) ...
Fixes #2573 by adding a
closed
flag in the series so that we don't attempt to use a nil ID when the series gets evicted from the wired list.This adds
TestWiredListPanic
that repros #2573 after running for a long duration (30 minutes or longer).New options were introduced to increase the frequency of ticks so that the test is more likely to hit the race condition bug.
Special notes for your reviewer:
Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: