Preserve task dependencies to futures passed as parameters in `.map` #6701

robalar · 2022-09-03T14:48:51Z

Preserves relationships between a mapped task and its parameters if that parameter is a future, closes #6686.

Example

from prefect import task, flow, get_run_logger


@task
def log_task(x: int):
    logger = get_run_logger()
    logger.info(x)


@task
def flat_numbers() -> list[int]:
    return [0, 1, 2, 3, 4]


@flow
def test_flow():
    x = flat_numbers.submit()
    log_task.map(x)

Will now correctly produce this graph:

Checklist

This pull request references any related issue by including "closes <link to issue>"
- If no issue exists and your change is not a small fix, please create an issue first.
This pull request includes tests or only affects documentation.
This pull request includes a fix, feature, enhancement, docs, or maintenance label categorizing the change.

netlify · 2022-09-03T14:48:53Z

❌ Deploy Preview for prefect-orion failed.

Built without sensitive environment variables

Name	Link
🔨 Latest commit	`835fd04`
🔍 Latest deploy log	https://app.netlify.com/sites/prefect-orion/deploys/631a3bf88806ba0008148a3a

src/prefect/engine.py

robalar · 2022-09-07T13:10:43Z

@madkinsz Have pushed a commit that passes all the test cases. It obviously uses wait_for but feel like it should extend when we start persisting the keys too.

I don't love that I had to add a special case to visit_collection but could not find any other way to recuse into resolved values and avoid infinite recursion.

I also found some tests in the same file that test relationships but only via values returned, I believe my tests would sit better next to them. As a side note, have you considered breaking up this file? It's really quite large and hard to orient/find things in.

zanieb · 2022-09-07T14:33:10Z

Hi again! visit_collection actually has a max_depth field. We just need to pass it through.

Here's a commit that:

Adds max depth to task input collection and resolution
Separates collection from resolution to avoid introducing new resolve future type
Uses max depth to avoid resolving futures below the top level
Adds extra_task_inputs passing through the engine so we do not have to use wait_for.

See 00d8647

I did not remove any of the functions you introduced and there may be some additional polish here. Are you interested in cherry-picking that commit into here and cleaning things up?

I also found some tests in the same file that test relationships but only via values returned, I believe my tests would sit better next to them.

I'd prefer to keep all the mapping tests separate from those other inputs tests for now since they're a special case. If you want to create a TestTaskMapInputs section that's fine with me.

As a side note, have you considered breaking up this file? It's really quite large and hard to orient/find things in.

We haven't yet! It seems reasonable in the long run but I mostly just navigate by class. We don't have strong norms about using classes vs files. I think I'd prefer files as we grow the test suite though.

robalar · 2022-09-07T14:44:30Z

Ah, that makes so much sense! The nesting was giving me a massive headache, but it didn't need to be! Feels obvious looking at it..... 😄

I did not remove any of the functions you introduced and there may be some additional polish here. Are you interested in cherry-picking that commit into here and cleaning things up?

Yes very happy to tidy up!

I'd prefer to keep all the mapping tests separate from those other inputs tests for now since they're a special case. If you want to create a TestTaskMapInputs section that's fine with me.

Noted, will split into its own section.

zanieb · 2022-09-08T15:16:00Z

Excellent work! Since I wrote some of this I'm going to request @bunchesofdonald as a reviewer. This looks good to me though.

robalar · 2022-09-08T15:23:53Z

@madkinsz thank you for your guidance on this! Not sure I can claim to have done too much of it, but glad to have helped it along!

bunchesofdonald

I think this looks great! That test suite is fantastic.

bunchesofdonald · 2022-09-08T17:18:35Z

src/prefect/engine.py

@@ -795,15 +801,14 @@ async def begin_task_map(
                wait_for=wait_for,
                return_type=return_type,
                task_runner=task_runner,
+                extra_task_inputs=task_inputs,


I'm not excited about extra_task_inputs, but I also can't think of a better approach at the moment. It may point at a possible need to refactor the way we validate the parameters to a map call or the way we gather at task inputs at some point.

Yeah me neither. I think ideally we have a way to track lengths on futures and create indexed children of futures. When the future has no known length, we resolve it to determine the length for mapping then access its indices (which creates child futures) that we pass downstream. That way, our existing tracking will work.

That said, users want to create dependencies between tasks in their graph without necessarily waiting for them to finish. extra_task_inputs may be useful for user-defined relationships that do not have runtime implications.

cc @WillRaphaelson for reference

tests/test_tasks.py

robalar · 2022-09-12T07:36:41Z

@madkinsz do you require any more action from me to get this merged? all tests are passing, netlify seems to be failing due to authentication issues to a private repo.

zanieb · 2022-09-12T15:33:23Z

See #6728 for some notes on that failure, it's expected right now. Thanks a bunch for contributing!

robalar requested review from cicdw and zanieb as code owners September 3, 2022 14:48

zanieb reviewed Sep 3, 2022

View reviewed changes

src/prefect/engine.py Outdated Show resolved Hide resolved

robalar force-pushed the bugfix/map-future-relationship branch from 9f0b164 to 0b4ec68 Compare September 8, 2022 15:11

robalar requested review from zanieb and removed request for cicdw September 8, 2022 15:12

zanieb added the fix A fix for a bug in an existing feature label Sep 8, 2022

zanieb requested a review from bunchesofdonald September 8, 2022 15:16

bunchesofdonald approved these changes Sep 8, 2022

View reviewed changes

Robert Hales added 2 commits September 8, 2022 20:00

Add test cases for dependency tracking

7d9f01d

Preserve relationships when mapping tasks

835fd04

robalar force-pushed the bugfix/map-future-relationship branch from 76b0058 to 835fd04 Compare September 8, 2022 19:01

zanieb merged commit f8935be into PrefectHQ:main Sep 12, 2022

robalar deleted the bugfix/map-future-relationship branch September 14, 2022 08:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preserve task dependencies to futures passed as parameters in `.map` #6701

Preserve task dependencies to futures passed as parameters in `.map` #6701

robalar commented Sep 3, 2022 •

edited by zanieb

Loading

netlify bot commented Sep 3, 2022 •

edited

Loading

robalar commented Sep 7, 2022

zanieb commented Sep 7, 2022 •

edited

Loading

robalar commented Sep 7, 2022

zanieb commented Sep 8, 2022

robalar commented Sep 8, 2022

bunchesofdonald left a comment

bunchesofdonald Sep 8, 2022

zanieb Sep 8, 2022 •

edited

Loading

zanieb Sep 8, 2022

robalar commented Sep 12, 2022

zanieb commented Sep 12, 2022

Preserve task dependencies to futures passed as parameters in .map #6701

Preserve task dependencies to futures passed as parameters in .map #6701

Conversation

robalar commented Sep 3, 2022 • edited by zanieb Loading

Example

Checklist

netlify bot commented Sep 3, 2022 • edited Loading

❌ Deploy Preview for prefect-orion failed.

robalar commented Sep 7, 2022

zanieb commented Sep 7, 2022 • edited Loading

robalar commented Sep 7, 2022

zanieb commented Sep 8, 2022

robalar commented Sep 8, 2022

bunchesofdonald left a comment

Choose a reason for hiding this comment

bunchesofdonald Sep 8, 2022

Choose a reason for hiding this comment

zanieb Sep 8, 2022 • edited Loading

Choose a reason for hiding this comment

zanieb Sep 8, 2022

Choose a reason for hiding this comment

robalar commented Sep 12, 2022

zanieb commented Sep 12, 2022

Preserve task dependencies to futures passed as parameters in `.map` #6701

Preserve task dependencies to futures passed as parameters in `.map` #6701

robalar commented Sep 3, 2022 •

edited by zanieb

Loading

netlify bot commented Sep 3, 2022 •

edited

Loading

zanieb commented Sep 7, 2022 •

edited

Loading

zanieb Sep 8, 2022 •

edited

Loading