Skip to content

Latest commit

 

History

History
212 lines (167 loc) · 8.91 KB

0132-ufcs.md

File metadata and controls

212 lines (167 loc) · 8.91 KB
  • Start Date: 2014-03-17
  • RFC PR #: #132
  • Rust Issue #: #16293

Summary

This RFC describes a variety of extensions to allow any method to be used as first-class functions. The same extensions also allow for trait methods without receivers to be invoked in a more natural fashion.

First, at present, the notation path::method() can be used to invoke inherent methods on types. For example, Vec::new() is used to create an instance of a vector. This RFC extends that notion to also cover trait methods, so that something like T::size_of() or T::default() is legal.

Second, currently it is permitted to reference so-called "static methods" from traits using a function-like syntax. For example, one can write Default::default(). This RFC extends that notation so it can be used with any methods, whether or not they are defined with a receiver. (In fact, the distinction between static methods and other methods is completely erased, as per the method lookup of RFC PR #48.)

Third, we introduce an unambiguous if verbose notation that permits one to precisely specify a trait method and its receiver type in one form. Specifically, the notation <T as TraitRef>::item can be used to designate an item item, defined in a trait TraitRef, as implemented by the type T.

Motivation

There are several motivations:

  • There is a need for an unambiguous way to invoke methods. This is typically a fallback for when the more convenient invocation forms fail:
    • For example, when multiple traits are in scope that all define the same method for the same types, there must be a way to disambiguate which method you mean.
    • It is sometimes desirable not to have autoderef:
      • For methods like clone() that apply to almost all types, it is convenient to be more specific about which precise type you want to clone. To get this right with autoderef, one must know the precise rules being used, which is contrary to the "DWIM" intention.
      • For types that implement Deref<T>, UFCS can be used to unambiguously differentiate between methods invoked on the smart pointer itself and methods invoked on its referent.
  • There are many methods, such as SizeOf::size_of(), that return properties of the type alone and do not naturally take any argument that can be used to decide which trait impl you are referring to.
    • This proposal introduces a variety of ways to invoke such methods, varying in the amount of explicit information one includes:
      • T::size_of() -- shorthand, but only works if T is a path
      • <T>::size_of() -- infers the trait SizeOf based on the traits in scope, just as with a method call
      • <T as SizeOf>::size_of() -- completely unambiguous

Detailed design

Path syntax

The syntax of paths is extended as follows:

PATH = ID_SEGMENT { '::' ID_SEGMENT }
     | TYPE_SEGMENT { '::' ID_SEGMENT }
     | ASSOC_SEGMENT '::' ID_SEGMENT { '::' ID_SEGMENT }
ID_SEGMENT   = ID [ '::' '<' { TYPE ',' TYPE } '>' ]
TYPE_SEGMENT = '<' TYPE '>'
ASSOC_SEGMENT = '<' TYPE 'as' TRAIT_REFERENCE '>'

Examples of valid paths. In these examples, capitalized names refer to types (though this doesn't affect the grammar).

a::b::c
a::<T1,T2>::b::c
T::size_of
<T>::size_of
<T as SizeOf>::size_of
Eq::eq
Eq::<T>::eq
Zero::zero

Normalization of path that reference types

Whenever a path like ...::a::... resolves to a type (but not a trait), it is rewritten (internally) to <...::a>::....

Note that there is a subtle distinction between the following paths:

ToStr::to_str
<ToStr>::to_str

In the former, we are selecting the member to_str from the trait ToStr. The result is a function whose type is basically equivalent to:

fn to_str<Self:ToStr>(self: &Self) -> String

In the latter, we are selecting the member to_str from the type ToStr (i.e., an ToStr object). Resolving type members is different. In this case, it would yield a function roughly equivalent to:

fn to_str(self: &ToStr) -> String

This subtle distinction arises from the fact that we pun on the trait name to indicate both a type and a reference to the trait itself. In this case, depending on which interpretation we choose, the path resolution rules differ slightly.

Paths that begin with a TYPE_SEGMENT

When a path begins with a TYPE_SEGMENT, it is a type-relative path. If this is the complete path (e.g., <int>), then the path resolves to the specified type. If the path continues (e.g., <int>::size_of) then the next segment is resolved using the following procedure. The procedure is intended to mimic method lookup, and hence any changes to method lookup may also change the details of this lookup.

Given a path <T>::m::...:

  1. Search for members of inherent impls defined on T (if any) with the name m. If any are found, the path resolves to that item.
  2. Otherwise, let IN_SCOPE_TRAITS be the set of traits that are in scope and which contain a member named m:
    • Let IMPLEMENTED_TRAITS be those traits from IN_SCOPE_TRAITS for which an implementation exists that (may) apply to T.
      • There can be ambiguity in the case that T contains type inference variables.
    • If IMPLEMENTED_TRAITS is not a singleton set, report an ambiguity error. Otherwise, let TRAIT be the member of IMPLEMENTED_TRAITS.
    • If TRAIT is ambiguously implemented for T, report an ambiguity error and request further type information.
    • Otherwise, rewrite the path to <T as Trait>::m::... and continue.

Paths that begin with an ASSOC_SEGMENT

When a path begins with an ASSOC_SEGMENT, it is a reference to an associated item defined from a trait. Note that such paths must always have a follow-on member m (that is, <T as Trait> is not a complete path, but <T as Trait>::m is).

To resolve the path, first search for an applicable implementation of Trait for T. If no implementation can be found -- or the result is ambiguous -- then report an error.

Otherwise:

  • Determine the types of output type parameters for Trait from the implementation.
  • If output type parameters were specified in the path, ensure that they are compatible with those specified on the impl.
    • For example, if the path were <int as SomeTrait<uint>>, and the impl is declared as impl SomeTrait<char> for int, then an error would be reported because char and uint are not compatible.
  • Resolve the path to the member of the trait with the substitution composed of the output type parameters from the impl and Self => T.

Alternatives

We have explored a number of syntactic alternatives. This has been selected as being the only one that is simultaneously:

  • Tolerable to look at.
  • Able to convey all necessary information along with auxiliary information the user may want to verify:
    • Self type, type of trait, name of member, type output parameters

Here are some leading candidates that were considered along with their equivalents in the syntax proposed by this RFC. The reasons for their rejection are listed:

module::type::(Trait::member)    <module::type as Trait>::member
--> semantics of parentheses considered too subtle
--> cannot accommodate types that are not paths, like `[int]`

(type: Trait)::member            <type as Trait>::member
--> complicated to parse
--> cannot accommodate types that are not paths, like `[int]`

... (I can't remember all the rest)

One variation that is definitely possible is that we could use the : rather than the keyword as:

<type: Trait>::member            <type as Trait>::member
--> no real objection. `as` was chosen because it mimics the
    syntax for constructing a trait object.

Unresolved questions

Is there a better way to disambiguate a reference to a trait item ToStr::to_str versus a reference to a member of the object type <ToStr>::to_str? I personally do not think so: so long as we pun on the name of the trait, the potential for confusion will remain. Therefore, the only two possibilities I could come up with are to try and change the question:

  • One answer might be that we simply make the second form meaningless by prohibiting inherent impls on object types. But there remains a utility to being able to write something like <ToStr>::is_sized() (where is_sized() is an example of a trait fn that could apply to both sized and unsized types). Moreover, artificially restricting object types just for this reason doesn't seem right.

  • Another answer is to change the syntax of object types. I have sometimes considered that impl ToStr might be better suited as the object type and then ToStr could be used as syntactic sugar for a type parameter. But there exists a lot of precedent for the current approach and hence I think this is likely a bad idea (not to mention that it's a drastic change).