What are Rust's exact auto-dereferencing rules?
Your pseudo-code is pretty much correct. For this example, suppose we had a method call `foo.bar()` where `foo: T`. I'm going to use the fully qualified syntax (FQS) to be unambiguous about what type the method is being called with, e.g. `A::bar(foo)` or `A::bar(&***foo)`. I'm just going to write a pile of random capital letters, each one is just some arbitrary type/trait, except `T` is always the type of the original variable `foo` that the method is called on.
The core of the algorithm is:

- For each "dereference step" `U` (that is, set `U = T` and then `U = *T`, ...)
  1. if there's a method `bar` where the receiver type (the type of `self` in the method) matches `U` exactly, use it (a "by value method")
  2. otherwise, add one auto-ref (take `&` or `&mut` of the receiver), and, if some method's receiver matches `&U`, use it (an "autorefd method")
Notably, everything considers the "receiver type" of the method, not the `Self` type of the trait, i.e. `impl ... for Foo { fn method(&self) {} }` thinks about `&Foo` when matching the method, and `fn method2(&mut self)` would think about `&mut Foo` when matching.
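As a minimal sketch of the receiver-type point, the program below uses a hypothetical `Foo` with a `hits` field and a hypothetical trait `Method` (neither is from the question). Both methods have `Self = Foo`, but their receiver types are `&Foo` and `&mut Foo`, so a call on a plain `Foo` only matches after one auto-ref.

```rust
// Hypothetical type and trait, used only to illustrate receiver types.
struct Foo {
    hits: u32,
}

trait Method {
    fn method(&self) -> u32; // receiver type: &Self
    fn method2(&mut self) -> u32; // receiver type: &mut Self
}

impl Method for Foo {
    fn method(&self) -> u32 {
        self.hits
    }
    fn method2(&mut self) -> u32 {
        self.hits += 1;
        self.hits
    }
}

fn main() {
    let mut foo = Foo { hits: 0 };

    // `foo: Foo`. No method takes `Foo` by value, so step 1 fails;
    // auto-ref gives `&Foo`, which matches `method`.
    println!("{}", foo.method());
    // The same call in fully qualified form.
    println!("{}", Method::method(&foo));

    // Here the auto-ref has to be `&mut Foo` to match `method2`.
    println!("{}", foo.method2());
    println!("{}", Method::method2(&mut foo));
}
```

Writing the FQS form next to each method call makes the inserted `&` or `&mut` visible.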
It is an error if there are ever multiple trait methods valid in the inner steps (that is, there can only be zero or one trait methods valid in each of 1. or 2., but there can be one valid for each: the one from 1 will be taken first), and inherent methods take precedence over trait ones. It's also an error if we get to the end of the loop without finding anything that matches. It is also an error to have recursive `Deref` implementations, which make the loop infinite (they'll hit the "recursion limit").
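The precedence of inherent methods can be seen in a small sketch. The names `S` and `Named` are hypothetical; both methods have the same receiver type `&S`, so they compete in the same step and the inherent one wins, while the FQS form still reaches the trait method.

```rust
// Hypothetical type with an inherent method and a trait method of the same name.
struct S;

impl S {
    fn name(&self) -> &'static str {
        "inherent"
    }
}

trait Named {
    fn name(&self) -> &'static str;
}

impl Named for S {
    fn name(&self) -> &'static str {
        "trait"
    }
}

fn main() {
    let s = S;
    // Method-call syntax: both candidates take `&S`, the inherent one is chosen.
    println!("{}", s.name());
    // Fully qualified syntax names the trait explicitly.
    println!("{}", Named::name(&s));
    println!("{}", <S as Named>::name(&s));
}
```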
These rules seem to do-what-I-mean in most circumstances, although having the ability to write the unambiguous FQS form is very useful in some edge cases, and for sensible error messages for macro-generated code.
Only one auto-reference is added because
- if there was no bound, things get bad/slow, since every type can have an arbitrary number of references taken
- taking one reference `&foo` retains a strong connection to `foo` (it is the address of `foo` itself), but taking more starts to lose it: `&&foo` is the address of some temporary variable on the stack that stores `&foo`.
Examples
Suppose we have a call `foo.refm()`. If `foo` has type:

- `X`, then we start with `U = X`, `refm` has receiver type `&...`, so step 1 doesn't match, taking an auto-ref gives us `&X`, and this does match (with `Self = X`), so the call is `RefM::refm(&foo)`
- `&X`, starts with `U = &X`, which matches `&self` in the first step (with `Self = X`), and so the call is `RefM::refm(foo)`
- `&&&&&X`, this doesn't match either step (the trait isn't implemented for `&&&&X` or `&&&&&X`), so we dereference once to get `U = &&&&X`, which matches 1 (with `Self = &&&X`) and the call is `RefM::refm(*foo)`
- `Z`, doesn't match either step so it is dereferenced once, to get `Y`, which also doesn't match, so it's dereferenced again, to get `X`, which doesn't match 1, but does match after autorefing, so the call is `RefM::refm(&**foo)`.
- `&&A`, the 1. doesn't match and neither does 2. since the trait is not implemented for `&A` (for 1) or `&&A` (for 2), so it is dereferenced to `&A`, which matches 1., with `Self = A`
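The `refm` cases above can be checked with a small program. This is only a sketch: the definitions of `X`, `Y`, `Z`, `A` and `RefM` below are assumptions chosen to be consistent with the cases described (the trait is implemented for `X`, `&&&X` and `A`; `Z` derefs to `Y`, which derefs to `X`). Each method returns a label naming the `Self` type that was selected.

```rust
use std::ops::Deref;

// Assumed definitions, chosen to match the cases described in the text.
trait RefM {
    fn refm(&self) -> &'static str;
}

struct X;
struct A;

impl RefM for X {
    fn refm(&self) -> &'static str { "X" }
}
impl RefM for &&&X {
    fn refm(&self) -> &'static str { "&&&X" }
}
impl RefM for A {
    fn refm(&self) -> &'static str { "A" }
}

struct Y(X);
impl Deref for Y {
    type Target = X;
    fn deref(&self) -> &X { &self.0 }
}

struct Z(Y);
impl Deref for Z {
    type Target = Y;
    fn deref(&self) -> &Y { &self.0 }
}

fn main() {
    let x = X;
    println!("{}", x.refm());        // RefM::refm(&x), Self = X
    println!("{}", (&x).refm());     // RefM::refm(&x) with no auto-ref, Self = X
    println!("{}", (&&&&&x).refm()); // one deref to &&&&X, Self = &&&X

    let z = Z(Y(X));
    println!("{}", z.refm());        // RefM::refm(&**z), Self = X

    println!("{}", (&&A).refm());    // one deref to &A, Self = A
}
```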
Suppose we have `foo.m()`, and that `A` isn't `Copy`. If `foo` has type:

- `A`, then `U = A` matches `self` directly so the call is `M::m(foo)` with `Self = A`
- `&A`, then 1. doesn't match, and neither does 2. (neither `&A` nor `&&A` implement the trait), so it is dereferenced to `A`, which does match, but `M::m(*foo)` requires taking `A` by value and hence moving out of `foo`, hence the error.
- `&&A`, 1. doesn't match, but autorefing gives `&&&A`, which does match, so the call is `M::m(&foo)` with `Self = &&&A`.
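A sketch of the by-value cases, assuming a trait `M` with `fn m(self)` implemented for `A` and `&&&A`, and an `A` that is deliberately not `Copy`. The definitions are assumptions consistent with the description above, and the labels returned are only there to show which `Self` was selected. The failing `&A` case is left as a comment because it does not compile.

```rust
// Assumed definitions: `M` takes `self` by value; `A` is not `Copy`.
trait M {
    fn m(self) -> &'static str;
}

struct A;

impl M for A {
    fn m(self) -> &'static str { "A" }
}
impl M for &&&A {
    fn m(self) -> &'static str { "&&&A" }
}

fn main() {
    let a = A;
    let r = &&a; // r: &&A

    // Step 1 fails for `&&A`; auto-ref gives `&&&A`, so this is M::m(&r).
    println!("{}", r.m());
    // `&&&A` matches by value directly.
    println!("{}", (&r).m());

    // (&a).m(); // rejected: resolves to M::m(*&a), which would move out of a borrow

    // `A` matches `self` directly: M::m(a), which moves `a`.
    println!("{}", a.m());
}
```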
(This answer is based on the code, and is reasonably close to the (slightly outdated) README. Niko Matsakis, the main author of this part of the compiler/language, also glanced over this answer.)
The Rust reference has a chapter about the method call expression. I copied the most important part below. Reminder: we are talking about an expression `recv.m()`, where `recv` is called "receiver expression" below.
> The first step is to build a list of candidate receiver types. Obtain these by repeatedly dereferencing the receiver expression's type, adding each type encountered to the list, then finally attempting an unsized coercion at the end, and adding the result type if that is successful. Then, for each candidate `T`, add `&T` and `&mut T` to the list immediately after `T`.
>
> For instance, if the receiver has type `Box<[i32;2]>`, then the candidate types will be `Box<[i32;2]>`, `&Box<[i32;2]>`, `&mut Box<[i32;2]>`, `[i32; 2]` (by dereferencing), `&[i32; 2]`, `&mut [i32; 2]`, `[i32]` (by unsized coercion), `&[i32]`, and finally `&mut [i32]`.
>
> Then, for each candidate type `T`, search for a visible method with a receiver of that type in the following places:
>
> 1. `T`'s inherent methods (methods implemented directly on `T` [¹]).
> 2. Any of the methods provided by a visible trait implemented by `T`. [...]
(Note about [¹]: I actually think this phrasing is wrong. I've opened an issue. Let's just ignore that sentence in the parentheses.)
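The `Box<[i32;2]>` case from the quoted passage can be observed directly. In this sketch the helper names `boxed_len` and `boxed_first` are hypothetical. `len` and `first` are slice methods with receiver `&[T]`, so for a `Box<[i32; 2]>` receiver they are only reached at the `&[i32]` candidate, the one that exists because of the unsized coercion.

```rust
// Hypothetical helpers; each calls a slice method on a boxed array.
fn boxed_len(b: &Box<[i32; 2]>) -> usize {
    // Resolves at the candidate `&[i32]`.
    b.len()
}

fn boxed_first(b: &Box<[i32; 2]>) -> Option<i32> {
    b.first().copied()
}

fn main() {
    let boxed: Box<[i32; 2]> = Box::new([10, 20]);

    println!("{}", boxed.len());
    // Spelling out the slice explicitly and using the fully qualified form.
    println!("{}", <[i32]>::len(&boxed[..]));

    println!("{}", boxed_len(&boxed));
    println!("{:?}", boxed_first(&boxed));
}
```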
Let's go through a few examples from your code in detail! For your examples, we can ignore the part about "unsized coercion" and "inherent methods".
`(*X{val:42}).m()`: the receiver expression's type is `i32`. We perform these steps:

- Creating list of candidate receiver types:
  - `i32` cannot be dereferenced, so we are already done with step 1. List: `[i32]`
  - Next, we add `&i32` and `&mut i32`. List: `[i32, &i32, &mut i32]`
- Searching for methods for each candidate receiver type:
  - We find `<i32 as M>::m` which has the receiver type `i32`. So we are already done.
So far so easy. Now let's pick a more difficult example: `(&&A).m()`. The receiver expression's type is `&&A`. We perform these steps:

- Creating list of candidate receiver types:
  - `&&A` can be dereferenced to `&A`, so we add that to the list. `&A` can be dereferenced again, so we also add `A` to the list. `A` cannot be dereferenced, so we stop. List: `[&&A, &A, A]`
  - Next, for each type `T` in the list, we add `&T` and `&mut T` immediately after `T`. List: `[&&A, &&&A, &mut &&A, &A, &&A, &mut &A, A, &A, &mut A]`
- Searching for methods for each candidate receiver type:
  - There is no method with receiver type `&&A`, so we go to the next type in the list.
  - We find the method `<&&&A as M>::m` which indeed has the receiver type `&&&A`. So we are done.
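The list construction for `(&&A).m()` can be mimicked with a few lines of string manipulation. This is a toy sketch, not the compiler's implementation: the hypothetical functions `deref_chain` and `candidates` work on type names written as text, and they assume the type is a stack of plain shared references with no `Deref` impls and no unsized coercion.

```rust
/// Toy model: dereference a type name by stripping leading `&`s.
/// Assumes plain shared references only (no `Deref` impls, no `&mut`).
fn deref_chain(ty: &str) -> Vec<String> {
    let mut chain = vec![ty.to_string()];
    let mut rest = ty;
    while let Some(inner) = rest.strip_prefix('&') {
        chain.push(inner.to_string());
        rest = inner;
    }
    chain
}

/// For each type `T` in the deref chain, emit `T`, `&T`, `&mut T` in that order.
fn candidates(ty: &str) -> Vec<String> {
    let mut out = Vec::new();
    for t in deref_chain(ty) {
        out.push(t.clone());
        out.push(format!("&{}", t));
        out.push(format!("&mut {}", t));
    }
    out
}

fn main() {
    // Prints the nine candidate receiver types for a receiver of type `&&A`.
    println!("{:?}", candidates("&&A"));
}
```

Method lookup then walks this list from the front and stops at the first type for which a method with that receiver exists.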
Here are the candidate receiver lists for all of your examples. The type that is enclosed in `⟪x⟫` is the one that "won", i.e. the first type for which a fitting method could be found. Also remember that the first type in the list is always the receiver expression's type. Lastly, each list is a flat list; any line wrapping is just formatting.
- `(*X{val:42}).m()` → `<i32 as M>::m`
  `[⟪i32⟫, &i32, &mut i32]`
- `X{val:42}.m()` → `<X as M>::m`
  `[⟪X⟫, &X, &mut X, i32, &i32, &mut i32]`
- `(&X{val:42}).m()` → `<&X as M>::m`
  `[⟪&X⟫, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&X{val:42}).m()` → `<&&X as M>::m`
  `[⟪&&X⟫, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&&X{val:42}).m()` → `<&&&X as M>::m`
  `[⟪&&&X⟫, &&&&X, &mut &&&X, &&X, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&&&X{val:42}).m()` → `<&&&X as M>::m`
  `[&&&&X, &&&&&X, &mut &&&&X, ⟪&&&X⟫, &&&&X, &mut &&&X, &&X, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&&&&X{val:42}).m()` → `<&&&X as M>::m`
  `[&&&&&X, &&&&&&X, &mut &&&&&X, &&&&X, &&&&&X, &mut &&&&X, ⟪&&&X⟫, &&&&X, &mut &&&X, &&X, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(*X{val:42}).refm()` → `<i32 as RefM>::refm`
  `[i32, ⟪&i32⟫, &mut i32]`
- `X{val:42}.refm()` → `<X as RefM>::refm`
  `[X, ⟪&X⟫, &mut X, i32, &i32, &mut i32]`
- `(&X{val:42}).refm()` → `<X as RefM>::refm`
  `[⟪&X⟫, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&X{val:42}).refm()` → `<&X as RefM>::refm`
  `[⟪&&X⟫, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&&X{val:42}).refm()` → `<&&X as RefM>::refm`
  `[⟪&&&X⟫, &&&&X, &mut &&&X, &&X, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&&&X{val:42}).refm()` → `<&&&X as RefM>::refm`
  `[⟪&&&&X⟫, &&&&&X, &mut &&&&X, &&&X, &&&&X, &mut &&&X, &&X, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `(&&&&&X{val:42}).refm()` → `<&&&X as RefM>::refm`
  `[&&&&&X, &&&&&&X, &mut &&&&&X, ⟪&&&&X⟫, &&&&&X, &mut &&&&X, &&&X, &&&&X, &mut &&&X, &&X, &&&X, &mut &&X, &X, &&X, &mut &X, X, &X, &mut X, i32, &i32, &mut i32]`
- `Y{val:42}.refm()` → `<i32 as RefM>::refm`
  `[Y, &Y, &mut Y, i32, ⟪&i32⟫, &mut i32]`
- `Z{val:Y{val:42}}.refm()` → `<i32 as RefM>::refm`
  `[Z, &Z, &mut Z, Y, &Y, &mut Y, i32, ⟪&i32⟫, &mut i32]`
- `A.m()` → `<A as M>::m`
  `[⟪A⟫, &A, &mut A]`
- `(&A).m()` → `<A as M>::m`
  `[&A, &&A, &mut &A, ⟪A⟫, &A, &mut A]`
- `(&&A).m()` → `<&&&A as M>::m`
  `[&&A, ⟪&&&A⟫, &mut &&A, &A, &&A, &mut &A, A, &A, &mut A]`
- `(&&&A).m()` → `<&&&A as M>::m`
  `[⟪&&&A⟫, &&&&A, &mut &&&A, &&A, &&&A, &mut &&A, &A, &&A, &mut &A, A, &A, &mut A]`
- `A.refm()` → `<A as RefM>::refm`
  `[A, ⟪&A⟫, &mut A]`
- `(&A).refm()` → `<A as RefM>::refm`
  `[⟪&A⟫, &&A, &mut &A, A, &A, &mut A]`
- `(&&A).refm()` → `<A as RefM>::refm`
  `[&&A, &&&A, &mut &&A, ⟪&A⟫, &&A, &mut &A, A, &A, &mut A]`
- `(&&&A).refm()` → `<&&&A as RefM>::refm`
  `[&&&A, ⟪&&&&A⟫, &mut &&&A, &&A, &&&A, &mut &&A, &A, &&A, &mut &A, A, &A, &mut A]`
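Several of the `m()` entries above can be confirmed by running them. The definitions below are a reconstruction under assumptions, since the question's code is not reproduced here: `X` wraps an `i32` and derefs to it, and `M` (with `fn m(self)`) is implemented for `i32`, `X`, `&X`, `&&X` and `&&&X`. Each impl returns a label for its `Self` type instead of printing.

```rust
use std::ops::Deref;

// Assumed reconstruction of the types the lists refer to.
struct X {
    val: i32,
}

impl Deref for X {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

trait M {
    fn m(self) -> &'static str;
}

impl M for i32 { fn m(self) -> &'static str { "i32" } }
impl M for X { fn m(self) -> &'static str { "X" } }
impl M for &X { fn m(self) -> &'static str { "&X" } }
impl M for &&X { fn m(self) -> &'static str { "&&X" } }
impl M for &&&X { fn m(self) -> &'static str { "&&&X" } }

fn main() {
    println!("{}", (*X { val: 42 }).m());     // first candidate `i32` wins
    println!("{}", X { val: 42 }.m());        // first candidate `X` wins
    println!("{}", (&X { val: 42 }).m());     // first candidate `&X` wins
    println!("{}", (&&&X { val: 42 }).m());   // first candidate `&&&X` wins
    println!("{}", (&&&&X { val: 42 }).m());  // fourth candidate `&&&X` wins
    println!("{}", (&&&&&X { val: 42 }).m()); // seventh candidate `&&&X` wins
}
```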