-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable segments read/published stats on Compaction task completion reports #15947
Enable segments read/published stats on Compaction task completion reports #15947
Conversation
d1794dc
to
1a8de2a
Compare
1a8de2a
to
3a945ce
Compare
…ble-segment-stats-for-compaction
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
...ce/src/main/java/org/apache/druid/indexing/common/IngestionStatsAndErrorsTaskReportData.java
Outdated
Show resolved
Hide resolved
.../org/apache/druid/indexing/common/task/batch/parallel/GeneratedPartitionsMetadataReport.java
Outdated
Show resolved
Hide resolved
...ain/java/org/apache/druid/indexing/common/task/batch/parallel/GeneratedPartitionsReport.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…druid/indexing/common/task/batch/parallel/GeneratedPartitionsReport.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried these scenarios:
- index_parallel with no additional subtasks (wikipedia) - the task report has segmentsRead: null and segmentsPublished: null in the report
- index_parallel with more than 1 subtask (kttm1) - the task report has segmentsRead: null and segmentsPublished: null in the sub task and segmentsRead: 0 and semgentsPublished: 1 in the index_parallel task
- Issued a compaction task with no subtasks (kttm1) - the task report has segmentsRead: null and segmentsPublished: null
- issued a compaction task with sub tasks (wikipedia) - no reports for the sub tasks, but the compactTask has segmentsRead: 1 segmentsPublished: 1
- index_kafka on kttm - report has segmentsRead: null segmentsPublished: null
Since this change is meant to be scoped to just compaction tasks, I am -1 in it's current implementation because there are many different task reports than have the fields segmentsRead
and segmentsPublished
added to it returning null and index_parallel tasks that have a report that says it read 0 segments (while true, that's a little confusing).
If these fields did not show up in the task reports where they are not meant to, I think the change would be good.
…ble-segment-stats-for-compaction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better! Only other thing I noticed is that index_parallel jobs with no subtasks that used to have task reports before this change no longer show a task report after this change. Can you look into that issue please
integration-tests/src/test/java/org/apache/druid/tests/indexer/ITCompactionTaskTest.java
Outdated
Show resolved
Hide resolved
...service/src/main/java/org/apache/druid/indexing/common/ParallelCompactionTaskReportData.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
Here is the fix: #16042 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the new fields, @adithyachakilam . I have left some comments that would be more in line with the existing report class structure.
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
...service/src/main/java/org/apache/druid/indexing/common/ParallelCompactionTaskReportData.java
Outdated
Show resolved
Hide resolved
...ain/java/org/apache/druid/indexing/common/task/batch/parallel/GeneratedPartitionsReport.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, @adithyachakilam ! Requested minor changes, otherwise looks good.
...ce/src/main/java/org/apache/druid/indexing/common/IngestionStatsAndErrorsTaskReportData.java
Show resolved
Hide resolved
...in/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentGenerateTask.java
Outdated
Show resolved
Hide resolved
...in/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentGenerateTask.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
...in/java/org/apache/druid/indexing/common/task/batch/parallel/PartialSegmentGenerateTask.java
Outdated
Show resolved
Hide resolved
integration-tests/src/test/java/org/apache/druid/tests/indexer/ITCompactionTaskTest.java
Show resolved
Hide resolved
…ble-segment-stats-for-compaction
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
Outdated
Show resolved
Hide resolved
…ble-segment-stats-for-compaction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for addressing the comments.
Description
This adds visibility into number of segments read/published by each parallel compaction task.
This PR adds new subclass for
IngestionStatsAndErrorsTaskReportData
to take new fields andParallelIndexSupervisorTask
to populate the segments read/published in different parallel compaction algos.Release note
Parallel compaction task completion reports now have segmentsRead and segmentsPublished fields to see how effective a compaction task is.
Key changed/added classes in this PR
ParallelIndexSupervisorTask.java
IngestionStatsAndErrorsTaskReportData.java
This PR has: