-
-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
@union / UnionRule for letting the engine figure out paths to products not known in advance #7116
Conversation
c57bfc7
to
166beac
Compare
1dd0377
to
78e5c31
Compare
Fixed and added some argument checking in 78e5c31 to overcome the otherwise-ambiguous reasoning about what means what in Get() with 2 vs 3 args. |
78e5c31
to
ac37919
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks neat. I think it's possible to make it simpler, and it would be good to see some actual usage within this PR.
@@ -57,6 +60,65 @@ def transitive_coroutine_rule(c): | |||
yield D(b) | |||
|
|||
|
|||
@union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even seeing it sketched out, it's not quite clear what real world usage will look like. Would you mind updating the BundleField
/SourceField
usecase
pants/src/python/pants/engine/legacy/graph.py
Lines 499 to 517 in f79b62b
class HydratedField(datatype(['name', 'value'])): | |
"""A wrapper for a fully constructed replacement kwarg for a HydratedTarget.""" | |
@rule(HydratedTarget, [Select(TargetAdaptorContainer)]) | |
def hydrate_target(target_adaptor_container): | |
target_adaptor = target_adaptor_container.value | |
"""Construct a HydratedTarget from a TargetAdaptor and hydrated versions of its adapted fields.""" | |
# Hydrate the fields of the adaptor and re-construct it. | |
hydrated_fields = yield [(Get(HydratedField, BundlesField, fa) | |
if type(fa) is BundlesField | |
else Get(HydratedField, SourcesField, fa)) | |
for fa in target_adaptor.field_adaptors] | |
kwargs = target_adaptor.kwargs() | |
for field in hydrated_fields: | |
kwargs[field.name] = field.value | |
yield HydratedTarget(target_adaptor.address, | |
TargetAdaptor(**kwargs), | |
tuple(target_adaptor.dependencies)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this part first after rebasing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this and it seemed to mysteriously work? It essentially just let us remove the if
in this comprehension you've highlighted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
One possibility (not sure whether it is an improvement though...) would be to do a function call rather than a decorator for union members.
That might look like:
@union
class HydrateableField(object): pass
class SourcesField(...): ...
def rules():
return [
union_rule(HydrateableField, SourcesField)
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There ended up being a ridiculous amount of confusion when trying to support use of @union_rule
as a decorator and as a function call, so I instead converted it to a datatype UnionRule
like we do with e.g. RootRule
, without any decorators, which makes much more sense anyway. I just didn't want to drop the possibility of it being a decorator too early.
src/python/pants/engine/native.py
Outdated
@@ -706,9 +716,26 @@ def func(constraint): | |||
return Function(self.context.to_key(constraint)) | |||
def tc(constraint): | |||
return TypeConstraint(self.context.to_key(constraint)) | |||
def flatten_type_map_to_constraints(type_constraints_map): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that part of the elegance of the implementation you suggested is that it shouldn't be necessary for the native engine to know about the union (yet... we could decide to do that later)...
If at RuleIndex
(or perhaps Scheduler
?) creation time you:
# expand a Get for a Union type...
Get(TypeX, Union)
# ...into Gets for all of its members:
Get(TypeX, UnionMember1), Get(TypeX, UnionMember2), .. #etc
... then there is no need for the engine itself to know about the existence of the union.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually thinking now that we don't need to do any expansion, but rather just check all Get
s for union types (I believe we can/should do this at RuleIndex
creation time) are valid according to the registered union rules (not sure if this was your intended meaning). This gives us the guarantee you've described below of knowing that Get
s are valid statically, which allows the engine to just generate a Get
from the runtime type as usual. And that should allow us to remove any knowledge of unions from the engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually thinking now that we don't need to do any expansion, but rather just check all Gets for union types are valid according to the registered union rules
Not quite: I think all I was referring to was replacing each union-Get with all union-member-Gets. Assuming you do that, the RuleGraph
handles all validation for you.
I believe we can/should do this at RuleIndex creation time
I don't know if we can assume that a RuleIndex always represents a closed world (right now). But we do know that we have a closed world at Scheduler creation time... so at Scheduler creation time you could rewrite the contents of the RuleIndex to replace a union-Get with union-member-Gets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After diving into the implementation of removing all the union knowledge from the engine, I think that sounds correct (since that's where we're already linking up the Get
s for unions in this PR anyway), and actually may already be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was! It seems like all the engine work was purely to provide a more specific error message on failing to resolve a union somehow. I did also convert a couple panic!
s into throw()
s, which enabled the technique you describe in another comment (just making a Get
and letting the engine error out if there's no path).
src/rust/engine/src/externs.rs
Outdated
interns.insert(response.values.unwrap_one()), | ||
))) | ||
let product = TypeConstraint(interns.insert(response.products.unwrap_one())); | ||
let subject_declared_type = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. So I guess this is because at runtime (rather than at @rule
parse time), we'll see a Get
with a declared type that might be a union, and then we might want to know the members of the union in order to validate that the given value is a member of that union?
It feels like not actually checking the declared type here, and just using exactly the type of the subject value would be fine. The result would still be the same as if there had been multiple literal Get
s in the body of the @rule
... and by this point we would already have validated that those are correct in the rule graph.
AFAICT, that would mean this method would not need to change at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm parsing your comment correctly, after expanding the Get
s to include all the registered constituent union members as you described in another comment, this method would indeed not need to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have implemented this, all I had to do was make gen_get()
return an error instead of panicking!
ac37919
to
c145847
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks really great.
src/python/pants/engine/rules.py
Outdated
|
||
def add_type_transition_rule(union_rule): | ||
""" | ||
NB: This does not require that union bases be supplied to `def rules():`! not sure if that's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally fine with that, for the same reason you put in the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have moved this to a comment noting that this is because the union type is never instantiated!
@contextmanager | ||
def _assert_execution_error(self, expected_msg): | ||
# TODO(#7303): use self.assertRaisesWithMessageContaining()! | ||
with self.assertRaises(ExecutionError) as cm: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be self.assertRaisesRegexp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be, but I'm realizing now that this probably would be difficult to convert to self.assertRaisesWithMessageContaining()
anyway because of the way it preprocesses the exception message. That could be converted to a regexp match (which I tried to do at one point), but I think stripping the locations from the traceback is much more robust and was just going to remove this comment, actually.
hydrated_fields = yield [(Get(HydratedField, BundlesField, fa) | ||
if type(fa) is BundlesField | ||
else Get(HydratedField, SourcesField, fa)) | ||
hydrated_fields = yield [Get(HydratedField, HydrateableField, fa) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome.
Closed #7117 since the engine doesn't know about union rules anymore! |
also test Get.extract_constraints()!
…d to test failure
a247b85
to
93e2f4e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woo!
src/python/pants/engine/rules.py
Outdated
elif hasattr(entry, '__call__'): | ||
rule = getattr(entry, 'rule', None) | ||
if rule is None: | ||
raise TypeError("Expected callable {} to be decorated with @rule.".format(entry)) | ||
add_rule(rule) | ||
else: | ||
# TODO: update this message! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either do this, or flesh out the TODO explaining what update should happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Will do!
Problem
Resolves #4535.
It's currently difficult for rules to tell the engine "give me an instance of
y
from thisx
according to the registered rules in the rule graph" ifx
is not a specific type known in advance (see #4535 for use cases for this). The alternative is that upstream rules have to use conditional logic outside of the engine, which is very difficult to write in a way that combines with arbitrary rulesets from downstream plugins, and code written to get around this can be error-prone. This is sad because the rule graph is generated by combining rules in pants core as well as in plugins, so all the necessary information is there, we just can't make use of it.This PR introduces the
@union
decorator andUnionRule
type to allow for static resolution of "union" types inyield Get(Product, UnionType, subject)
statements as per this comment in #4535 and below.Solution
Python
subject_declared_type
field topants.engine.selectors.Get
instead of inferring it from the subject type (the 2-argument form ofGet
is still supported).Get.create_statically_from_rule_graph()
classmethod to make it clear when theGet
subject is populated or not.@union
andUnionRule
to describe union types which can be requested with ayield Get(Product, UnionType, subject)
in rule bodies (as per this comment on #4535).@union
classes are not registered indef rules(): ...
-- this distinction seems to make sense as union classes are never instantiated.union_rules
dict field inRuleIndex
which registers@union_rule()
decorators as a map fromunion base type -> [union members]
.union_rules
dict to the scheduler, and when addingGet
edges (in_register_task()
inscheduler.py
), check if thesubject_declared_type
is a union base, and if so add edges to all union members.HydrateableField
union base which is used to resolve (for now) eitherSourcesField
orBundlesField
in ayield Get()
.Result
Users can now dynamically add union members in plugins and backends to be processed by upstream rules using
yield Get(...)
which don't know anything about them, and with a static universe of known union members which the engine uses uses to type-check the subject of ayield Get(...)
at rule execution time.