-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature-13324 contol loading lookups in peons #16266
Conversation
703dd32
to
7e83988
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@IgorBerman , thanks for your solution! It makes sense to optimize lookup loading, but we need to leave some room in this design for future improvements. Please read comments for details.
@@ -331,4 +331,9 @@ static TaskInfo<TaskIdentifier, TaskStatus> toTaskIdentifierInfo(TaskInfo<Task, | |||
taskInfo.getTask().getMetadata() | |||
); | |||
} | |||
|
|||
default boolean doesntNeedLookups() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method should rather be called needsLookups()
or loadLookups()
. There is no point naming this method in the negative.
Also, since we want to extend this feature in the future to be able to load lookups on demand, I think we should return an enum here with possible values NONE
, ALL
, ON_DEMAND
.
For the time-being, the behaviour for ALL
and ON_DEMAND
would be the same. But later, ON_DEMAND
can be optimized to load only required lookups.
So, overall:
default LookupLoadingMode needsLookups()
{
return LookupLoadingMode.ALL;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @kfaraz for the comments.
Makes sense to me.
I'll ping you when your suggestions will be addressed.
Also here is something that I've discovered lately and if you have some suggestions/ideas please share them: I've tried to start nano cluster with my changes and discovered that it's not that simple. TaskToolbox injects LookupNodeService, i.e. currently it's not possible to remove it without additional refactoring, so I'm trying to convert it to "Lazy" one by injecting Providing instead so that LookupSerdeModule could create this Provider as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed interface @kfaraz
@@ -379,6 +379,11 @@ public TaskStatus call() | |||
command.add("true"); | |||
} | |||
|
|||
if (task.doesntNeedLookups()) { | |||
command.add("--doesntNeedLookups"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter should be called --loadLookups
instead, and should take values all
, none
, onDemand
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed
4443e93
to
42fc8e3
Compare
42fc8e3
to
199e726
Compare
run nano cluster locally and verified that all services started, ingestion from kafka works and compaction also able to work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @IgorBerman ! Overall, this looks like a neat approach to me. I have left some comments.
server/src/main/java/org/apache/druid/query/lookup/LookupLoadingMode.java
Show resolved
Hide resolved
@@ -185,6 +187,13 @@ public class CliPeon extends GuiceRunnable | |||
@Option(name = "--loadBroadcastSegments", title = "loadBroadcastSegments", description = "Enable loading of broadcast segments") | |||
public String loadBroadcastSegments = "false"; | |||
|
|||
|
|||
/** | |||
* If set to true then lookups won't be loaded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This javadoc needs to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@@ -363,7 +372,7 @@ public String getTaskIDFromTask(final Task task) | |||
new IndexingServiceTuningConfigModule(), | |||
new InputSourceModule(), | |||
new ChatHandlerServerModule(properties), | |||
new LookupModule() | |||
LookupLoadingMode.NONE == loadLookups ? new LookupSerdeModule() : new LookupModule() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe put this logic in a private method getLookupModule()
and add a javadoc there explaining what we are doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A log line which indicates the chosen lookup loading mode and the expected effects would also be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added log and private method. also added tests for it
@@ -537,6 +539,115 @@ ProcessHolder runTaskProcess(List<String> command, File logFile, TaskLocation ta | |||
Assert.assertTrue(forkingTaskRunner.restore().isEmpty()); | |||
} | |||
|
|||
@Test | |||
public void testTaskRun_doesnt_pass_loadLookups_Argument() throws Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other tests in this class don't use the underscore style of naming tests. Please adhere to the existing style in this class if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed
server/src/main/java/org/apache/druid/query/lookup/LookupSerdeModule.java
Show resolved
Hide resolved
dc80221
to
fe1b2cb
Compare
@kfaraz Hi, I think I addressed all of your comments. Also hopefully fixed coverage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for promptly addressing the feedback! I have left a few more suggestions.
indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java
Show resolved
Hide resolved
Assert.assertTrue(peon.resolveLookupModule() instanceof LookupSerdeModule); | ||
} | ||
|
||
@Test(expected = IllegalArgumentException.class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Assert.assertThrows()
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced with assertThrows
@@ -58,6 +59,9 @@ public List<? extends Module> getJacksonModules() | |||
public void configure(Binder binder) | |||
{ | |||
JsonConfigProvider.bind(binder, LookupModule.PROPERTY_BASE, LookupConfig.class); | |||
//Even though LookupSerdeModule will be used for processes that are not loading lookups, we need to bind LookupNodeService, | |||
//so that TaskToolboxFactory and TaskToolbox will get their dependency. Binding it won't load lookups. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I suppose we could have just made this dependency @Nullable
in TaskToolboxFactory
.
The only usage is to build a DruidDiscoveryNode
where the LookupNodeService
object is used to populate the services map.
From the point of service discovery, I think it would be more appropriate to just not add the lookup node service to the map rather than declaring a dummy node service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tried this and got following error for compaction task
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors:
1) Could not find a suitable constructor in org.apache.druid.discovery.LookupNodeService. Classes must have either one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private.
at org.apache.druid.discovery.LookupNodeService.class(LookupNodeService.java:38)
while locating org.apache.druid.discovery.LookupNodeService
for the 26th parameter of org.apache.druid.indexing.common.TaskToolboxFactory.<init>(TaskToolboxFactory.java:165)
at org.apache.druid.cli.CliPeon.bindTaskConfigAndClients(CliPeon.java:506) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
1 error
at org.apache.druid.cli.CliPeon.run(CliPeon.java:421)
at org.apache.druid.cli.Main.main(Main.java:112)
Caused by: java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too familiar with Guice(using spring boot at work) so might be there is other option to have Optional dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually @nullable is the way to mark it optional...so I'm not sure why it doesn't work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe in additional to @nullable need to add empty c'tor to LookupNodeService ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, we can try to address this in a follow up.
server/src/main/java/org/apache/druid/query/lookup/LookupLoadingMode.java
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/query/lookup/LookupSerdeModule.java
Outdated
Show resolved
Hide resolved
...d-extensions/src/test/java/org/apache/druid/k8s/overlord/taskadapter/K8sTaskAdapterTest.java
Show resolved
Hide resolved
b559004
to
92144e8
Compare
@kfaraz thanks for the review! I think I've addressed all your comments besides binding of Lookup service. Tried your idea but seems compaction peon fails with this approach. |
1de34f8
to
1355717
Compare
indexing-service/src/test/java/org/apache/druid/indexing/overlord/ForkingTaskRunnerTest.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/druid/query/lookup/LookupLoadingMode.java
Outdated
Show resolved
Hide resolved
dd23158
to
57c765f
Compare
I see that 1 test failing: org.apache.druid.indexing.overlord.ForkingTaskRunnerTest#testInvalidTaskContextJavaOptsArray both of them updated at
which doesn't happen when parsing of arguments failing here
which throws exception and which is goes up to outer
i.e. those static variables are not incremented as part of failure to parse arguments(maybe should be fixed) |
@zachjsh Hi, maybe you can advice regarding org.apache.druid.indexing.overlord.ForkingTaskRunnerTest#testInvalidTaskContextJavaOptsArray and updating FAILED_TASK_COUNT & SUCCESSFUL_TASK_COUNT? |
run same test on master branch and it's failing as well, which is strange |
Yeah, oddly enough, for me, the test fails only when I run it individually. When I run all the tests in this class from the IDE together, it passes. It is most likely because the failed/success task counts are static. A test that runs before this one must be setting the failed count to 1, which is being verified in this test too. |
@kfaraz can you tell me what I should do in addition for this PR to be merged? |
@IgorBerman , I have created #16323 to fix up the test for now so that the builds pass. Later, we can revisit the code and decide if we want to increment the failed task count in the case where the javaOptsArray is invalid. |
thanks @kfaraz , I left 1 comment there |
@IgorBerman , I have merged #16323 . You can try to merge the latest master into your branch to fix the errors. |
57c765f
to
a120dfe
Compare
thanks @kfaraz , updated my branch with upstream |
@IgorBerman , FYI, there is another PR #16328 which tackles the same problem as this one. Let me know what you think about it. |
@kfaraz thanks :) I looked at it and it seems nice approach. I think the other PR is more generic since it permits to control which lookups will be loaded. I haven't understood if it supports configuring it as some context arguments for task. In our case it's always all or nothing, but I can imagine more involved environments. Additional diff that I can think of is that if we are not going to load lookups current PR uses different guice module which might create lighter runtime(i.e. less beans e.g., but I'm not sure this is the case since I'm not too familiar with differences of those 2 modules really). |
The author intends to create that as a follow up PR.
Thanks, @IgorBerman . |
Fixes #13324
Permits to disable lookups loading for some types of peons
for example compaction or kill tasks
this should improve memory footprint
e.g. for compaction we can use this memory for loading more rows into memory and thus improving throughput of compaction
I've followed same path as loadBroadcastSegments parameter works and reused LookupSerdeModule which already used in multiple services that doesn't need to load lookups
The fix adds doesntNeedLookups method to interface of task with default of false
Release note
Key changed/added classes in this PR
Task
- added doesntNeedLookups with default implementation that returns false, i.e. most of peons will continue to load lookupsCompactionTask
- overrides it with trueCliPeon
- refers to the process parameter and loads relevant lookup module(which either loads lookups or not)This PR has: