Fix `ForkingTaskRunnerTest` #16323

kfaraz · 2024-04-23T11:07:57Z

ForkingTaskRunner uses static fields to track task counts which are reported in metrics.
This causes some tests like ForkingTaskRunnerTest.testInvalidTaskContextJavaOptsArray() to give false positives if the variables had been set by a preceding test.

Changes:

- Use non-static fields to track task counts in ForkingTaskRunner
- Update assertions in tests to ensure that the tests are idempotent.
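
To illustrate the problem described above, here is a minimal sketch of how a static counter leaks between tests while an instance counter does not. The class and method names (`CounterScopeSketch`, `StaticCountRunner`, `InstanceCountRunner`, `markFailed`) are hypothetical stand-ins and are not taken from `ForkingTaskRunner`.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CounterScopeSketch
{
  // Hypothetical runner that keeps its counter in a static field,
  // so every instance in the JVM shares the same value.
  static class StaticCountRunner
  {
    private static final AtomicLong FAILED_TASK_COUNT = new AtomicLong();

    void markFailed()
    {
      FAILED_TASK_COUNT.incrementAndGet();
    }

    long getFailedTaskCount()
    {
      return FAILED_TASK_COUNT.get();
    }
  }

  // Hypothetical runner that keeps its counter in an instance field,
  // so each new instance starts from zero.
  static class InstanceCountRunner
  {
    private final AtomicLong failedTaskCount = new AtomicLong();

    void markFailed()
    {
      failedTaskCount.incrementAndGet();
    }

    long getFailedTaskCount()
    {
      return failedTaskCount.get();
    }
  }

  public static void main(String[] args)
  {
    // "Test 1" records a failure on its own runner.
    new StaticCountRunner().markFailed();
    // "Test 2" builds a fresh runner and records nothing, yet sees the old value.
    System.out.println(new StaticCountRunner().getFailedTaskCount()); // prints 1

    // With instance fields, the second runner is unaffected by the first.
    new InstanceCountRunner().markFailed();
    System.out.println(new InstanceCountRunner().getFailedTaskCount()); // prints 0
  }
}
```

With the static variant, an assertion such as "failed count is 1" can pass only because an earlier test left the counter at 1, which is the kind of false positive this PR is meant to remove.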

IgorBerman · 2024-04-23T12:10:19Z

@kfaraz don't you think there should be
failedTaskCount.incrementAndGet();
near LOGGER.info(t, "Exception caught during execution"); in the wrapping try-catch?
I mean, when we pass bad arguments, an IllegalArgumentException is thrown but the code never increments failedTaskCount.

IgorBerman · 2024-04-23T12:18:31Z

btw, those counters are used in metrics reporting, so removing static will cause a change in those metrics... I mean, from a test perspective static is causing races, but if we look at the metric level and WorkerTaskCountStatsProvider, I can understand why it's static, since it "Provides task / task count status at the level of individual worker nodes".

IgorBerman · 2024-04-23T12:31:42Z

If the testing framework makes sure that each test runs one after another and not in parallel, another option would be to reset the static variables in some visible-for-testing method.
An additional option would be to use some interface that a) decouples from ForkingTaskRunnerTest & WorkerTaskCountStatsProvider and b) "collects" reports from ForkingTaskRunnerTest. For tests it will be 1 implementation, race free.
For production it can wrap those static variables, and WorkerTaskCountStatsProvider will take those reports from this interface.
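
A rough sketch of what such an interface could look like, purely as an illustration of the idea. Every name here (`TaskCountSink`, `InMemoryTaskCountSink`, `Runner`, `finish`) is hypothetical and does not exist in Druid; a production implementation wrapping shared counters is assumed but not shown.

```java
import java.util.concurrent.atomic.AtomicLong;

public class TaskCountSinkSketch
{
  // Hypothetical interface: the runner reports outcomes here and the
  // stats provider reads the totals back from it.
  interface TaskCountSink
  {
    void recordSuccess();

    void recordFailure();

    long getSuccessCount();

    long getFailedCount();
  }

  // Per-test implementation: each test creates its own sink,
  // so no state is shared between tests.
  static class InMemoryTaskCountSink implements TaskCountSink
  {
    private final AtomicLong success = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    @Override
    public void recordSuccess()
    {
      success.incrementAndGet();
    }

    @Override
    public void recordFailure()
    {
      failed.incrementAndGet();
    }

    @Override
    public long getSuccessCount()
    {
      return success.get();
    }

    @Override
    public long getFailedCount()
    {
      return failed.get();
    }
  }

  // Hypothetical runner that only knows about the interface.
  static class Runner
  {
    private final TaskCountSink sink;

    Runner(TaskCountSink sink)
    {
      this.sink = sink;
    }

    void finish(boolean succeeded)
    {
      if (succeeded) {
        sink.recordSuccess();
      } else {
        sink.recordFailure();
      }
    }
  }

  public static void main(String[] args)
  {
    TaskCountSink sink = new InMemoryTaskCountSink();
    Runner runner = new Runner(sink);
    runner.finish(true);
    runner.finish(false);
    System.out.println(sink.getSuccessCount() + " succeeded, " + sink.getFailedCount() + " failed");
  }
}
```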

kfaraz · 2024-04-24T02:54:05Z

> but if we looking at metric level and WorkerTaskCountStatsProvider I can understand why it's static since it's "Provides task / task count status at the level of individual worker nodes"

@IgorBerman, I am not sure I follow. How would this be affected if we make the fields non-static?
ForkingTaskRunner is used only by MiddleManagers and each MiddleManager would have only a single instance of it. Keeping the fields static doesn't serve any purpose, afaict.

kfaraz · 2024-04-24T02:56:01Z

> don't you think there should be
> failedTaskCount.incrementAndGet();
> near LOGGER.info(t, "Exception caught during execution"); at wrapping try-catch?

Yes, I wanted to do this but decided to do it later, as I need to look at all the exceptions and check whether it would be better to not throw an exception at all and simply return a TaskStatus.failure() in those cases. I need to test it properly before making those changes.

The changes in this PR are more straightforward and just meant to get the build right without altering any behaviour.
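
For reference, a minimal sketch of the follow-up idea discussed above: count the failure in the catch block and return a failure status instead of throwing. This is not what this PR does. The `Status` enum, the `Runner` class, and `validate` are hypothetical and only stand in for the real task-status and validation code.

```java
import java.util.concurrent.atomic.AtomicLong;

public class FailureCountingSketch
{
  // Hypothetical stand-in for a task status.
  enum Status
  {
    SUCCESS,
    FAILED
  }

  static class Runner
  {
    private final AtomicLong failedTaskCount = new AtomicLong();

    // Validates the given options and reports the outcome as a status.
    Status run(Object javaOptsArray)
    {
      try {
        validate(javaOptsArray);
        return Status.SUCCESS;
      }
      catch (IllegalArgumentException e) {
        // Count the failure here so bad arguments show up in the metric,
        // then report a failure status rather than rethrowing.
        failedTaskCount.incrementAndGet();
        return Status.FAILED;
      }
    }

    long getFailedTaskCount()
    {
      return failedTaskCount.get();
    }

    // Hypothetical validation: the options must be an array of strings.
    private static void validate(Object javaOptsArray)
    {
      if (!(javaOptsArray instanceof String[])) {
        throw new IllegalArgumentException("must be an array of strings");
      }
    }
  }

  public static void main(String[] args)
  {
    Runner runner = new Runner();
    System.out.println(runner.run("not-an-array"));
    System.out.println(runner.getFailedTaskCount());
  }
}
```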

IgorBerman · 2024-04-24T05:07:22Z

Oh, ok, if it's a singleton then it will work and there's no need for static. I missed this part.

AmatyaAvadhanula · 2024-04-24T07:38:14Z

indexing-service/src/test/java/org/apache/druid/indexing/overlord/ForkingTaskRunnerTest.java

@@ -494,7 +493,7 @@ int waitForTaskProcessToComplete(Task task, ProcessHolder processHolder, File lo
                                              + task.getId()
                                              + " must be an array of strings.")
    );
-    Assert.assertEquals(1L, (long) forkingTaskRunner.getWorkerFailedTaskCount());
+    Assert.assertEquals(0L, (long) forkingTaskRunner.getWorkerFailedTaskCount());


Why are both failed and successful counts 0?

kfaraz

Yes, ideally the failed count should be 1, but it is currently not being tracked.
We can fix up the code later as mentioned here: #16323 (comment).

AmatyaAvadhanula

Thanks for the clarification, @kfaraz. LGTM

kfaraz · 2024-04-24T08:35:20Z

Thanks for the review, @AmatyaAvadhanula , @IgorBerman !

kfaraz added 2 commits April 23, 2024 16:26

Fix ForkingTaskRunnerTest

abce71a

Fix up tests

d279fc5

github-actions bot added the Area - Ingestion label Apr 23, 2024

kfaraz mentioned this pull request Apr 23, 2024

feature-13324 contol loading lookups in peons #16266

Closed

10 tasks

kfaraz changed the title ~~Fix fork runner test~~ Fix ForkingTaskRunnerTest Apr 24, 2024

AmatyaAvadhanula reviewed Apr 24, 2024

AmatyaAvadhanula approved these changes Apr 24, 2024

kfaraz merged commit 1dabb02 into apache:master Apr 24, 2024
85 checks passed

kfaraz deleted the fix_fork_runner_test branch April 24, 2024 08:35

adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
