store the first raw value of a chunk during downsampling #1709

Merged: 5 commits merged into thanos-io:master on Nov 9, 2019

alfred-landrum
Contributor

@alfred-landrum alfred-landrum commented Nov 3, 2019

As discussed in #1568, storing only the last raw value
of a chunk will lose a counter reset when:
a) the reset occurs at a chunk boundary, and
b) the last raw value of the earlier chunk is less than
the first aggregated value of the later chunk.

This commit stores the first raw value of a chunk during
the initial raw aggregation, and retains it during
subsequent aggregations. This is similar to the existing
handling for the last raw value of a chunk.

With this change, when counterSeriesIterator iterates over
a chunk boundary, it will see the last raw value of the
earlier chunk, then the first raw value of the later chunk,
and then the first aggregated value of the later chunk. The
first raw value will always be less than or equal to the
first aggregated value, so the only difference in
counterSeriesIterator's output will be the possible detection
of a reset and an extra sample after the chunk boundary.

Fixes: #1568

Signed-off-by: Alfred Landrum <alfred@leakybucket.org>

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

Member

@bwplotka bwplotka left a comment

Nice! I don't have a way to really e2e test this case, but I read through the algorithm, and it makes sense to me 👍 Thanks for the awesome explanations on both the issue and the PR!

Just curious, did you also observe this in your real system in an actual query? (: If yes, does Thanos with this PR return the expected result?

Small style nit only from my side.

@brian-brazil could you take a look as well? (:

pkg/compact/downsample/downsample.go
@@ -289,7 +289,13 @@ func (b *aggrChunkBuilder) add(t int64, aggr *aggregator) {
 	b.added++
 }

-func (b *aggrChunkBuilder) finalizeChunk(lastT int64, trueSample float64) {
+func (b *aggrChunkBuilder) firstRawSample(firstT int64, trueSample float64) {
Member

Those functions are quite shallow, and really the same. Can we maybe just inline them with a comment? We use them twice, sure, but inlining them might be even clearer.

Member

Maybe, but IMHO this split is clear too, since the function's name literally tells you what's happening 😄 Up to you.

Contributor Author

I've handled this in the latest diff by inlining the bodies of the functions and pointing the inlined code to new explanatory comments at CounterSeriesIterator. Please take a look.

Contributor

@brian-brazil brian-brazil left a comment

👍

@alfred-landrum
Contributor Author

@bwplotka: regarding your question, I didn't observe this directly. My colleague @aponjavic spotted the potential issue as we were studying how Thanos implements downsampling, so I don't have an easy setup that reproduces the issue and the fix.

Member

@GiedriusS GiedriusS left a comment

Maybe we should also update the comment around the type CounterSeriesIterator?

Signed-off-by: Alfred Landrum <alfred@leakybucket.org>
Member

@GiedriusS GiedriusS left a comment

👍

Member

@bwplotka bwplotka left a comment

LGTM, Thanks!

BTW, great talk at PromCon (: We might even want to link the slides in the downsampling doc.

@bwplotka bwplotka merged commit 3debaeb into thanos-io:master Nov 9, 2019
IKSIN pushed a commit to monitoring-tools/thanos that referenced this pull request Nov 26, 2019
* store the first raw value of a chunk during downsampling

Signed-off-by: Alfred Landrum <alfred@leakybucket.org>

* changelog for thanos-io#1709

Signed-off-by: Alfred Landrum <alfred@leakybucket.org>

* adjust existing downsampling tests

Signed-off-by: Alfred Landrum <alfred@leakybucket.org>

* add counter aggregation comments to CounterSeriesIterator

Signed-off-by: Alfred Landrum <alfred@leakybucket.org>
Signed-off-by: Aleksey Sin <asin@ozon.ru>
Successfully merging this pull request may close these issues.

counter reset on aggregation boundaries
4 participants