Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Introduce rule-name-centric calling convention, and deprecate return-type-centric Get #18895

Closed
stuhood opened this issue May 3, 2023 · 7 comments

Comments

@stuhood
Copy link
Sponsor Member

stuhood commented May 3, 2023

tl;dr: This issue proposes introducing a syntax to call @rules by name, and deprecating the Get-by-return-type syntax in almost all cases (@unions TBD).

Motivation

Currently, rather than actually referring to the @rule that will be called, Gets use a return-type-centric syntax (await Get(SomeReturnType, SomeArgument(..))). But as our set of plugins has expanded and our understanding of @unions has matured, it's becoming clearer that one of the motivations for that syntax is probably no longer valid.

"@rule graph solving" (documented here) refers to the process of:

  1. choosing the @rule implementations to use for:
    1. each positional argument to a @rule
    2. each of a @rule's Gets
  2. determining the set of Params which must be part of the runtime graph Node for a @rule, because they will be used (transitively) to compute its dependencies

That second aspect of @rule graph solving (determining the set of Params that a @rule needs in order to run) is something which cannot practically be done by hand, and so it continues to be an essential component of composing @rules. But the first aspect (deciding which @rules will be called) can definitely be done by hand, since at a fundamental level, it can be reduced to the same sort of function calling convention that is used in most programming languages: calling a function by name.

There is no hardship involved in calling functions by name, but when the v2 engine was originally created, particular focus was paid to pluggability and extensibility. Avoiding calling functions by name meant that the implementations of your dependencies could be swapped out without mocking, and that "abstract" @rules (currently represented as @unions) would be on equal footing with other calls.

But it's become clear in the intervening time that explicitly introducing and tracking pluggability upstream via a set of declared @unions is more maintainable and easier to reason about than attempting to swap out @rules. And the cost/complexity of rule graph solving means that:

  1. @rule graph error messages remain incomprehensible
  2. startup time for pantsd is noticeable (with enough @rules installed, even in --release mode)
  3. feature development is inhibited by the inability to make fundamental changes to @rule graph solving (e.g. Re-enable concurrent runs for pantsd in v2 #7654 (comment))

To simplify and remove roughly half of the magic involved in rule graph solving, we should deprecate the Get-by-return-type syntax in favor of calling @rules by name (@unions TBD).

Syntax

One syntax for accomplishing this would be to adjust the @rule decorator to convert the function it is applied to into a Callable with adapted arguments, which returns a Get-shaped type. For a @rule like:

@rule
def my_rule(arg1: Arg1, arg2: Arg2) -> ReturnType:
    return ReturnType()

Callsites could allow positional arguments before a literal ... (ellipses). After the ellipses, they would use the existing relevant portion of Get syntax to provide any (transitive) parameters for the @rule call:

# Equivalent to `Get(ReturnType)` (i.e. no arguments to `Get`). All arguments would be
# computed from Params which were already in scope:
await the_rule_to_call(...)

# Using a positional arg. Roughly equivalent to `Get(ReturnType, Arg1())`: the remaining arguments would be
# computed from Params which were already in scope:
await the_rule_to_call(Arg1(), ...)

# Two positional args. Roughly equivalent to `Get(ReturnType, {Arg1(): Arg1, Arg2(): Arg2)`. Note that
# `@rule` graph solving is still necessary for this case, because the called `@rule` may have `Get`s
# which have additional dependencies.
await the_rule_to_call(Arg1(), Arg2())

# Equivalent to `Get(ReturnType, Arg1())`:
await the_rule_to_call(..., Arg1())

# Equivalent to `Get(ReturnType, {Arg1(): Arg1, Arg2})`:
await the_rule_to_call(..., {Arg1(): Arg1, Arg2(): Arg2})
@thejcannon
Copy link
Member

So just to be clear, we still aren't calling the rule. We're just shifting the convention and the magic to look closer to normal code?

@thejcannon
Copy link
Member

Oh and how does this work with providing an arg that isn't the request type, but is convertible through the engine to the request type?

@kaos
Copy link
Member

kaos commented May 4, 2023

I like the syntax as it feels and looks more concise. The described examples are a bit vague to me still though. Summarizing my understanding of it yields:

# Anything after ellipsis is equiv. with regular `Get`
await the_rule_to_call(..., *args)
await Get(*args)

# Anything before ellipsis will be complemented with in scope params as needed
await the_rule_to_call(*args, ...)
await Get(*args)  # adds params from scope =>
    Get(ReturnType, {*args, Param1: param1, Param2: param2})

# Not using ellipsis must provide all required params for the call as the params in scope will not be consulted; this would be the preferred recommended way to call rules in order to fully specify the requirements?
await the_rule_to_call(*args)
await Get(*args)  # w/o added params from scope

Was this somewhat accurate?

IIUC this merely directs the rule graph building to cut down the number of alternative rules for a dependency key, reducing the exponential explosion of variants going into monomorphization, at the cost of not being able to swap out rules by replacing backends, leaving @unions to be the only pluggable alternative.

I'm in favour of adding support for this call rule-by-name convention, but I'm vary of deprecating the Get-style call rule-by-return-type at the same time.

@kaos
Copy link
Member

kaos commented May 4, 2023

Oh and how does this work with providing an arg that isn't the request type, but is convertible through the engine to the request type?

I imagine the same thing would hold as today for this case, i.e. the engine would need to go out and find the chain of rules to use to get the required type from what you provided.

@kaos
Copy link
Member

kaos commented May 4, 2023

It is only in order to be able to deprecate call rule-by-return-type that an alternative for @unions is needed to be able to use the call rule-by-name, so for that I have this idea:

As there is no single rule to call by name for a @union, why not go with what we have, the union type:

@union 
class Car:
  pass

class Volvo:
  pass

Union(Car, Volvo)

# Old style:
await Get(Car, Volvo(...))

# New style, using the union type as the rule to call:
await Car(Volvo(...))

With the added benefit that it will be very clear at the call site that this is no regular rule we're calling but a union type.

This has some implementation implications, but I think they are manageable.

@thejcannon
Copy link
Member

thejcannon commented May 4, 2023

Hey @stuhood can we convert this to a discussion so we can reply in threads?
(There's a button at the bottom of the right column on this page that does that)

@thejcannon
Copy link
Member

Also out of curiosity, would this open the door for not using type-based returns in the engine for solving rule graph queries? We could endow the engine's representation of the return type with the name of the rule for differentiation. E.g. if we know the name of the rule, the following definitions are unambiguous in which one to use:

async foo(input: int) -> FrozenDict[...]: pass
async bar(input: int) -> FrozenDict[...]: pass

The engine creates a distinct type for each using the rule name(space) as a delineator, and rule parsing knows which should be used because now we're using the rule name.

@pantsbuild pantsbuild locked and limited conversation to collaborators May 4, 2023
@stuhood stuhood converted this issue into discussion #18905 May 4, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants