Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@union / UnionRule for letting the engine figure out paths to products not known in advance #7116

Merged
merged 19 commits into from
Mar 6, 2019

Conversation

cosmicexplorer
Copy link
Contributor

@cosmicexplorer cosmicexplorer commented Jan 20, 2019

Problem

Resolves #4535.

It's currently difficult for rules to tell the engine "give me an instance of y from this x according to the registered rules in the rule graph" if x is not a specific type known in advance (see #4535 for use cases for this). The alternative is that upstream rules have to use conditional logic outside of the engine, which is very difficult to write in a way that combines with arbitrary rulesets from downstream plugins, and code written to get around this can be error-prone. This is sad because the rule graph is generated by combining rules in pants core as well as in plugins, so all the necessary information is there, we just can't make use of it.

This PR introduces the @union decorator and UnionRule type to allow for static resolution of "union" types in yield Get(Product, UnionType, subject) statements as per this comment in #4535 and below.

Solution

Python

  • Add a subject_declared_type field to pants.engine.selectors.Get instead of inferring it from the subject type (the 2-argument form of Get is still supported).
  • Introduce Get.create_statically_from_rule_graph() classmethod to make it clear when the Get subject is populated or not.
  • Introduce @union and UnionRule to describe union types which can be requested with a yield Get(Product, UnionType, subject) in rule bodies (as per this comment on #4535).
    • Note that @union classes are not registered in def rules(): ... -- this distinction seems to make sense as union classes are never instantiated.
  • Create a really simple union_rules dict field in RuleIndex which registers @union_rule() decorators as a map from union base type -> [union members].
  • Propagate the union_rules dict to the scheduler, and when adding Get edges (in _register_task() in scheduler.py), check if the subject_declared_type is a union base, and if so add edges to all union members.
  • Create a HydrateableField union base which is used to resolve (for now) either SourcesField or BundlesField in a yield Get().

Result

Users can now dynamically add union members in plugins and backends to be processed by upstream rules using yield Get(...) which don't know anything about them, and with a static universe of known union members which the engine uses uses to type-check the subject of a yield Get(...) at rule execution time.

@cosmicexplorer cosmicexplorer changed the title [WIP] @union_rule (or something along those lines) @union_rule (or something along those lines) Jan 20, 2019
@cosmicexplorer
Copy link
Contributor Author

Fixed the unit test errors by moving Get typechecking to scheduler.py in 1dd0377 because we do that during rule execution now, and also added testing for Get.extract_constraints().

./pants --v2 --no-v1 list :: fails (see travis) which I will fix momentarily.

@cosmicexplorer
Copy link
Contributor Author

./pants --v2 --no-v1 list :: fails (see travis) which I will fix momentarily.

Fixed and added some argument checking in 78e5c31 to overcome the otherwise-ambiguous reasoning about what means what in Get() with 2 vs 3 args.

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks neat. I think it's possible to make it simpler, and it would be good to see some actual usage within this PR.

@@ -57,6 +60,65 @@ def transitive_coroutine_rule(c):
yield D(b)


@union
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even seeing it sketched out, it's not quite clear what real world usage will look like. Would you mind updating the BundleField/SourceField usecase

class HydratedField(datatype(['name', 'value'])):
"""A wrapper for a fully constructed replacement kwarg for a HydratedTarget."""
@rule(HydratedTarget, [Select(TargetAdaptorContainer)])
def hydrate_target(target_adaptor_container):
target_adaptor = target_adaptor_container.value
"""Construct a HydratedTarget from a TargetAdaptor and hydrated versions of its adapted fields."""
# Hydrate the fields of the adaptor and re-construct it.
hydrated_fields = yield [(Get(HydratedField, BundlesField, fa)
if type(fa) is BundlesField
else Get(HydratedField, SourcesField, fa))
for fa in target_adaptor.field_adaptors]
kwargs = target_adaptor.kwargs()
for field in hydrated_fields:
kwargs[field.name] = field.value
yield HydratedTarget(target_adaptor.address,
TargetAdaptor(**kwargs),
tuple(target_adaptor.dependencies))
here, to see how that informs the rest of the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this part first after rebasing!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this and it seemed to mysteriously work? It essentially just let us remove the if in this comprehension you've highlighted.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

One possibility (not sure whether it is an improvement though...) would be to do a function call rather than a decorator for union members.

That might look like:

@union
class HydrateableField(object): pass

class SourcesField(...): ...

def rules():
  return [
      union_rule(HydrateableField, SourcesField)
    ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There ended up being a ridiculous amount of confusion when trying to support use of @union_rule as a decorator and as a function call, so I instead converted it to a datatype UnionRule like we do with e.g. RootRule, without any decorators, which makes much more sense anyway. I just didn't want to drop the possibility of it being a decorator too early.

@@ -706,9 +716,26 @@ def func(constraint):
return Function(self.context.to_key(constraint))
def tc(constraint):
return TypeConstraint(self.context.to_key(constraint))
def flatten_type_map_to_constraints(type_constraints_map):
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that part of the elegance of the implementation you suggested is that it shouldn't be necessary for the native engine to know about the union (yet... we could decide to do that later)...

If at RuleIndex (or perhaps Scheduler?) creation time you:

# expand a Get for a Union type...
Get(TypeX, Union)

# ...into Gets for all of its members:
Get(TypeX, UnionMember1), Get(TypeX, UnionMember2), .. #etc

... then there is no need for the engine itself to know about the existence of the union.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually thinking now that we don't need to do any expansion, but rather just check all Gets for union types (I believe we can/should do this at RuleIndex creation time) are valid according to the registered union rules (not sure if this was your intended meaning). This gives us the guarantee you've described below of knowing that Gets are valid statically, which allows the engine to just generate a Get from the runtime type as usual. And that should allow us to remove any knowledge of unions from the engine.

Copy link
Sponsor Member

@stuhood stuhood Mar 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually thinking now that we don't need to do any expansion, but rather just check all Gets for union types are valid according to the registered union rules

Not quite: I think all I was referring to was replacing each union-Get with all union-member-Gets. Assuming you do that, the RuleGraph handles all validation for you.

I believe we can/should do this at RuleIndex creation time

I don't know if we can assume that a RuleIndex always represents a closed world (right now). But we do know that we have a closed world at Scheduler creation time... so at Scheduler creation time you could rewrite the contents of the RuleIndex to replace a union-Get with union-member-Gets?

Copy link
Contributor Author

@cosmicexplorer cosmicexplorer Mar 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After diving into the implementation of removing all the union knowledge from the engine, I think that sounds correct (since that's where we're already linking up the Gets for unions in this PR anyway), and actually may already be done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was! It seems like all the engine work was purely to provide a more specific error message on failing to resolve a union somehow. I did also convert a couple panic!s into throw()s, which enabled the technique you describe in another comment (just making a Get and letting the engine error out if there's no path).

interns.insert(response.values.unwrap_one()),
)))
let product = TypeConstraint(interns.insert(response.products.unwrap_one()));
let subject_declared_type =
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. So I guess this is because at runtime (rather than at @rule parse time), we'll see a Get with a declared type that might be a union, and then we might want to know the members of the union in order to validate that the given value is a member of that union?

It feels like not actually checking the declared type here, and just using exactly the type of the subject value would be fine. The result would still be the same as if there had been multiple literal Gets in the body of the @rule... and by this point we would already have validated that those are correct in the rule graph.

AFAICT, that would mean this method would not need to change at all?

Copy link
Contributor Author

@cosmicexplorer cosmicexplorer Mar 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm parsing your comment correctly, after expanding the Gets to include all the registered constituent union members as you described in another comment, this method would indeed not need to change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented this, all I had to do was make gen_get() return an error instead of panicking!

@cosmicexplorer cosmicexplorer changed the title @union_rule (or something along those lines) @union / UnionRule for letting the engine figure out paths to products not known in advance Mar 3, 2019
Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks really great.


def add_type_transition_rule(union_rule):
"""
NB: This does not require that union bases be supplied to `def rules():`! not sure if that's
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally fine with that, for the same reason you put in the description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have moved this to a comment noting that this is because the union type is never instantiated!

@contextmanager
def _assert_execution_error(self, expected_msg):
# TODO(#7303): use self.assertRaisesWithMessageContaining()!
with self.assertRaises(ExecutionError) as cm:
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be self.assertRaisesRegexp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be, but I'm realizing now that this probably would be difficult to convert to self.assertRaisesWithMessageContaining() anyway because of the way it preprocesses the exception message. That could be converted to a regexp match (which I tried to do at one point), but I think stripping the locations from the traceback is much more robust and was just going to remove this comment, actually.

hydrated_fields = yield [(Get(HydratedField, BundlesField, fa)
if type(fa) is BundlesField
else Get(HydratedField, SourcesField, fa))
hydrated_fields = yield [Get(HydratedField, HydrateableField, fa)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome.

@cosmicexplorer
Copy link
Contributor Author

Closed #7117 since the engine doesn't know about union rules anymore!

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woo!

elif hasattr(entry, '__call__'):
rule = getattr(entry, 'rule', None)
if rule is None:
raise TypeError("Expected callable {} to be decorated with @rule.".format(entry))
add_rule(rule)
else:
# TODO: update this message!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either do this, or flesh out the TODO explaining what update should happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Will do!

@stuhood stuhood merged commit e8ccd90 into pantsbuild:master Mar 6, 2019
@stuhood stuhood mentioned this pull request Aug 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Design v2 Target API
3 participants