Have an easier way to tell the progress of compactor (and downsampling) #3985

bwplotka · 2021-03-29T09:43:18Z

Currently, you need to open the Thanos Compact UI and check whether all older blocks are bigger. There should be only up to 5 of the 2h, 8h and 2d blocks. The rest should be compacted to 2w. The same applies to downsampled blocks.
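The manual check described above could be automated roughly as follows. This is only a sketch: the function name, the range table and the reading of "up to 5" as a per-range limit are assumptions made for illustration, not existing Thanos code.

```python
from collections import Counter

# Block ranges mentioned above, in hours (2w = 14 days). Hypothetical table.
RANGES_HOURS = {"2h": 2, "8h": 8, "2d": 48, "2w": 336}
# Threshold taken from the rule of thumb above; assumed to apply per range.
MAX_SMALL_BLOCKS = 5


def backlog_by_range(block_hours):
    """Return the ranges below 2w that hold more blocks than expected."""
    counts = Counter()
    for hours in block_hours:
        # Assign each block to the smallest range that fits its duration.
        label = next(
            (name for name, limit in RANGES_HOURS.items() if hours <= limit),
            "2w",
        )
        counts[label] += 1
    return {
        name: n
        for name, n in counts.items()
        if name != "2w" and n > MAX_SMALL_BLOCKS
    }
```

A non-empty result would suggest that the compactor is behind for those ranges; an empty one matches the "healthy" picture described above.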

Or you can check the number of compactions per day to see if the number stabilizes.

Both ways are pretty manual. I propose adding a metric that indicates the backlog of compactions still to be done. This potentially requires changing our compaction planner logic, which is already planned for #3405

2nick · 2021-05-17T12:58:41Z

I'm trying to figure out a way of reporting progress and think that there are 2 possible ways:

API with JSON response
Prometheus metrics

In my opinion, for both ways it would be enough to expose:

block id
group id for the block
compaction state of the group
compaction state for the block

It's probably reasonable to use the time of a block's state change as the metric value.

Of course, it's possible to implement our own readers for all stages to report progress in bytes, but that looks like "overkill".

WDYT?

stale · 2021-07-17T00:21:21Z

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

vanugrah · 2021-07-28T16:23:01Z

Yes please! I spent the weekend trying to answer that question since we have a huge compaction backlog and I couldn't think of a simple instrumentation to add. Largely because currently compaction planning is iterative, so we'd need to simulate multiple plan invocations per group to accurately determine how many compaction runs would need to happen per group to reach the desired state.

As a user - I'd love to see an overall compaction percentage for the bucket as well as compaction progress for each group. I'm going to think more about this problem and report back.

kernelpanic77 · 2021-08-18T11:01:02Z

Hello @bwplotka! I am relatively new to the Thanos project but would love to get involved and potentially contribute. Are there any pointers or resources to help me get started with Thanos, so that I can get a better understanding of the project and this issue?

Thanks,
Ishan

yeya24 · 2021-10-12T17:47:39Z

> Yes please! I spent the weekend trying to answer that question since we have a huge compaction backlog and I couldn't think of a simple instrumentation to add. Largely because currently compaction planning is iterative, so we'd need to simulate multiple plan invocations per group to accurately determine how many compaction runs would need to happen per group to reach the desired state.
>
> As a user - I'd love to see an overall compaction percentage for the bucket as well as compaction progress for each group. I'm going to think more about this problem and report back.

Sounds like a promising way to go!
We can have a single goroutine run the planning simulation. In that goroutine, we use the grouper and planner to plan based on metadata from the fetchers.
We keep planning until no plan is available and count the number of iterations needed. This number represents the compactor progress.
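The simulation loop described above can be sketched like this. The planner here is a deliberately simple toy, not the Thanos/TSDB planner: it assumes a group is just a list of block levels and that a plan merges `fanout` blocks of one level into a single block of the next level. All names and parameters are hypothetical.

```python
from collections import Counter


def plan(levels, fanout=3, max_level=4):
    """Toy planner: return the lowest level that has enough blocks to merge."""
    counts = Counter(levels)
    for level in sorted(counts):
        if level < max_level and counts[level] >= fanout:
            return level
    return None  # nothing left to compact


def simulate_remaining_compactions(levels, fanout=3, max_level=4):
    """Re-plan until no plan is available and count the iterations."""
    levels = list(levels)  # work on a copy; real blocks are never touched
    runs = 0
    while True:
        level = plan(levels, fanout, max_level)
        if level is None:
            return runs
        # Apply the plan to the in-memory metadata only.
        for _ in range(fanout):
            levels.remove(level)
        levels.append(level + 1)
        runs += 1
```

With this toy model, nine level-1 blocks need four runs: three merges into level 2, then one merge into level 3. The returned count is what a progress metric could export per group.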

Question:

Each compaction iteration means different work to do. For example, a plan for 2 level 2 blocks and a plan for multiple level 4 blocks. How can we quantify the work?

bwplotka · 2021-10-15T14:57:22Z

What we discussed in our 1:2 with @yeya24 @metonymic-smokey :

The idea to simulate planning sounds amazing. It will take marginally more time and more complex code, but in the end we can (1) estimate compaction progress and (2) plan it better (optimize!)

> Each compaction iteration means different work to do. For example, a plan for 2 level 2 blocks and a plan for multiple level 4 blocks. How can we quantify the work?

We can estimate samples/bytes, but it will be an approximation (:
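One way to read the bytes-based estimate is to weight each simulated plan by the total size of its input blocks. The sketch below assumes each plan is a list of hypothetical `(block_id, size_bytes)` pairs; it only sums input sizes, so it stays an approximation because the size of the compacted output is not known in advance.

```python
def estimate_plan_bytes(plans):
    """Return (bytes per plan, total bytes) for a list of simulated plans.

    Each plan is a list of (block_id, size_bytes) tuples. The shape is an
    assumption for illustration, not a Thanos data structure.
    """
    per_plan = [sum(size for _, size in blocks) for blocks in plans]
    return per_plan, sum(per_plan)
```

For example, a plan over two small blocks and a plan over three large ones would then contribute very different amounts to the backlog, even though each counts as one iteration.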

metonymic-smokey · 2021-10-24T06:25:13Z

Another idea that @yeya24 came up with is calculating retention progress, once we have finished working on compaction and downsampling progress. Broadly, it will also be on the same lines i.e. simulation and exporting metrics.

yeya24 · 2021-11-06T17:45:34Z

> Another idea that @yeya24 came up with is calculating retention progress, once we have finished working on compaction and downsampling progress. Broadly, it will also be on the same lines i.e. simulation and exporting metrics.

Let's have another issue to track this.

bwplotka added feature request/improvement difficulty: hard help wanted labels Mar 29, 2021

yashrsharma44 mentioned this issue May 13, 2021

Improvements on Thanos Compactor #4233

Open

stale bot added the stale label Jul 17, 2021

stale bot removed the stale label Jul 28, 2021

bwplotka changed the title ~~Have easier to way to tell the progress of compactor~~ Have easier to way to tell the progress of compactor (and downsampling) Oct 12, 2021

bwplotka mentioned this issue Oct 12, 2021

compactor: Add metric for tracking progress of downsampling (: #3478

Closed

metonymic-smokey mentioned this issue Oct 26, 2021

Metrics for compaction and downsampling process #4801

Merged

2 tasks

yeya24 closed this as completed in #4801 Nov 6, 2021
