Where and how is the _ (underscore) variable specified?
Most are aware of _’s special meaning in IRB as a holder for the last return value, but that is not what I'm asking about here.
Instead, I’m asking about _ when used as a variable name in plain-old-Ruby-code. Here it appears to have special behavior, akin to a “don't care variable” (à la Prolog). Here are some useful examples illustrating its unique behavior:
lambda { |x, x| 42 } # SyntaxError: duplicated argument name
lambda { |_, _| 42 }.call(4, 2) # => 42
lambda { |_, _| 42 }.call(_, _) # NameError: undefined local variable or method `_'
lambda { |_| _ + 1 }.call(42) # => 43
lambda { |_, _| _ }.call(4, 2) # 1.8.7: => 2
# 1.9.3: => 4
_ = 42
_ * 100 # => 4200
_, _ = 4, 2; _ # => 2
These were all run in Ruby directly (with puts calls added in), not IRB, to avoid conflicting with its additional functionality.
This is all a result of my own experimentation though, as I cannot find any documentation on this behavior anywhere (admittedly it's not the easiest thing to search for). Ultimately, I'm curious how all of this works internally so I can better understand exactly what is special about _. So I’m asking for references to documentation, and, preferably, the Ruby source code (and perhaps RubySpec) that reveal how _ behaves in Ruby.
Note: most of this arose out of this discussion with @Niklas B.
There is some special handling in the source to suppress the "duplicated argument name" error. The error message only appears in shadowing_lvar_gen inside parse.y; the 1.9.3 version looks like this:
static ID
shadowing_lvar_gen(struct parser_params *parser, ID name)
{
    if (idUScore == name) return name;
    /* ... */
and idUScore is defined in id.c like this:
REGISTER_SYMID(idUScore, "_");
You'll see similar special handling in warn_unused_var:
static void
warn_unused_var(struct parser_params *parser, struct local_vars *local)
{
    /* ... */
    for (i = 0; i < cnt; ++i) {
        if (!v[i] || (u[i] & LVAR_USED)) continue;
        if (idUScore == v[i]) continue;
        rb_compile_warn(ruby_sourcefile, (int)u[i], "assigned but unused variable - %s", rb_id2name(v[i]));
    }
}
You'll notice that the warning is suppressed on the second line of the for loop.
The only special handling of _ that I could find in the 1.9.3 source is above: the duplicate name error is suppressed and the unused variable warning is suppressed. Other than those two things, _ is just a plain old variable like any other. I don't know of any documentation about the (minor) specialness of _.
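The duplicate-name behavior described above can be probed from Ruby itself. The following is a minimal sketch, not taken from the Ruby source: it uses eval so that the SyntaxError can be rescued at runtime, and compiles? is a hypothetical helper name introduced only for this illustration.

```ruby
# Hypothetical helper: compile a snippet at runtime so that a SyntaxError
# can be rescued instead of aborting the whole script.
def compiles?(source)
  eval(source)
  true
rescue SyntaxError
  false
end

# An ordinary name may not be repeated in a parameter list...
DUPLICATE_X_OK = compiles?("lambda { |x, x| 42 }")

# ...but the parser lets a repeated _ through.
DUPLICATE_USCORE_OK = compiles?("lambda { |_, _| 42 }")

# The resulting lambda still takes two arguments.
USCORE_RESULT = lambda { |_, _| 42 }.call(4, 2)

puts "x, x compiles: #{DUPLICATE_X_OK}"
puts "_, _ compiles: #{DUPLICATE_USCORE_OK}"
puts "call result:   #{USCORE_RESULT}"
```

SyntaxError is not a StandardError, so it has to be named explicitly in the rescue clause; a bare rescue would not catch it.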
In Ruby 2.0, the idUScore == v[i] test in warn_unused_var is replaced with a call to is_private_local_id:
if (is_private_local_id(v[i])) continue;
rb_warn4S(ruby_sourcefile, (int)u[i], "assigned but unused variable - %s", rb_id2name(v[i]));
and is_private_local_id suppresses warnings for variables that begin with _:
if (name == idUScore) return 1;
/* ... */
return RSTRING_PTR(s)[0] == '_';
rather than just _ itself. So 2.0 loosens things up a bit.
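The warning suppression can be observed from a script as well. This is a rough sketch that assumes Ruby 2.0 or later and assumes that parser warnings are written to whatever object $stderr currently points at; parse_warnings is a hypothetical helper, not part of Ruby.

```ruby
require "stringio"

# Hypothetical helper: compile `source` with warnings enabled and return
# whatever was written to $stderr while doing so.
def parse_warnings(source)
  saved_stderr, saved_verbose = $stderr, $VERBOSE
  $stderr = StringIO.new
  $VERBOSE = true
  # Define the method on a throwaway object to avoid "method redefined" noise.
  Object.new.instance_eval(source)
  $stderr.string
ensure
  $stderr, $VERBOSE = saved_stderr, saved_verbose
end

PLAIN_WARNINGS    = parse_warnings("def probe; unused = 1; nil; end")
USCORE_WARNINGS   = parse_warnings("def probe; _ = 1; nil; end")
PREFIXED_WARNINGS = parse_warnings("def probe; _unused = 1; nil; end")

puts "unused  -> #{PLAIN_WARNINGS.inspect}"
puts "_       -> #{USCORE_WARNINGS.inspect}"
puts "_unused -> #{PREFIXED_WARNINGS.inspect}"
```

On 1.9.3 the source quoted above suggests that only the bare _ case would stay silent, while _unused would still warn.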
_ is a valid identifier. Identifiers can not only contain underscores; a single underscore is also an identifier on its own.
_ = o = Object.new
_.object_id == o.object_id
# => true
You can also use it as a method name:
def o._; :_ end
o._
# => :_
Of course, it is not exactly a readable name, nor does it pass any information to the reader about what the variable refers to or what the method does.
IRB, in particular, sets _ to the value of the last expression:
$ irb
> 'asd'
# => "asd"
> _
# => "asd"
In the IRB source code, this is done by simply assigning the last value to _:
@workspace.evaluate self, "_ = IRB.CurrentContext.last_value"
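To illustrate the idea behind that line, here is a toy evaluator that does the same kind of assignment after every expression. It is only a sketch and not IRB's actual implementation: the TinyRepl class and its evaluate method are hypothetical names, and it relies on Binding#local_variable_set, which is available from Ruby 2.1.

```ruby
# Toy evaluator (hypothetical, not IRB code): after each evaluation the
# result is stored in a local variable named _ inside the same binding.
class TinyRepl
  def initialize
    @binding = Object.new.instance_eval { binding }
    @binding.local_variable_set(:_, nil)
  end

  def evaluate(source)
    result = @binding.eval(source)
    @binding.local_variable_set(:_, result)
    result
  end
end

repl = TinyRepl.new
FIRST  = repl.evaluate("'asd'")
SECOND = repl.evaluate("_")      # reads the previous result
THIRD  = repl.evaluate("_ * 2")  # _ is an ordinary local here

puts FIRST, SECOND, THIRD
```

Nothing here depends on _ being special: any other valid local variable name would work the same way.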
Did some repository exploring. Here's what I found:
On the last lines of the file id.c, there is the call:
REGISTER_SYMID(idUScore, "_");
grepping the source for idUScore gave me two seemingly relevant results:
- In the shadowing_lvar_gen function
- In the warn_unused_var function
shadowing_lvar_gen seems to be the mechanism through which the formal parameter of a block replaces a variable of the same name that exists in another scope. It is the function that seems to raise the "duplicated argument name" SyntaxError and the "shadowing outer local variable" warning.
After grepping the source for shadowing_lvar_gen, I found the following in the changelog for Ruby 1.9.3:
Tue Dec 11 01:21:21 2007 Yukihiro Matsumoto
- parse.y (shadowing_lvar_gen): no duplicate error for "_".
Which is likely to be the origin of this line:
if (idUScore == name) return name;
From this, I deduce that in a situation such as proc { |_, _| :x }.call :a, :b, one _ variable simply shadows the other.
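A small experiment along these lines, offered as a sketch rather than a statement about the parser internals: reading _ inside such a block yields one of the two arguments. Which one is version-dependent (the question reports 2 on 1.8.7 and 4 on 1.9.3 for the equivalent lambda), so no particular winner is assumed here.

```ruby
# Both parameters are named _, so only one of the two arguments is
# reachable through that name inside the block body.
VISIBLE = proc { |_, _| _ }.call(:a, :b)

# Which argument is visible depends on the Ruby version, so just print it.
puts "_ refers to #{VISIBLE.inspect} on Ruby #{RUBY_VERSION}"
```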
Here's the commit in question. It basically introduced these two lines:
if (!uscore) uscore = rb_intern("_");
if (uscore == name) return;
From a time when idUScore did not even exist, apparently.