What's the difference between a hash and a hash reference in Perl?

I would like to properly understand hashes in Perl. I've had to use Perl intermittently for quite some time, and whenever I do, it's mostly for text processing.

And every time I have to deal with hashes, it gets messed up. I find the syntax for hashes very cryptic.

A good explanation of hashes and hash references, their differences, when they are required etc. would be much appreciated.


Solution 1:

A simple hash is close to an array. Their initializations even look similar. First the array:

@last_name = (
  "Ward",   "Cleaver",
  "Fred",   "Flintstone",
  "Archie", "Bunker"
);

Now let's represent the same information with a hash (aka associative array):

%last_name = (
  "Ward",   "Cleaver",
  "Fred",   "Flintstone",
  "Archie", "Bunker"
);

Although they have the same name, the array @last_name and the hash %last_name are completely independent.

With the array, if we want to know Archie's last name, we have to perform a linear search:

my $lname;
for (my $i = 0; $i < @last_name; $i += 2) {
  $lname = $last_name[$i+1] if $last_name[$i] eq "Archie";
}
print "Archie $lname\n";

With the hash, it's much more direct syntactically:

print "Archie $last_name{Archie}\n";
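
To make the comparison a little more concrete, here is a minimal sketch of a few other everyday hash operations on the same data: testing for a key with exists, adding a pair, and walking the keys in sorted order. The names $has_barney and @full_names are only illustrative.

```perl
use strict;
use warnings;

my %last_name = (
    "Ward",   "Cleaver",
    "Fred",   "Flintstone",
    "Archie", "Bunker",
);

# exists tests for a key without creating it
my $has_barney = exists $last_name{Barney} ? 1 : 0;

# assigning to a new key adds the pair
$last_name{Barney} = "Rubble";

# keys returns the keys in no particular order, so sort for stable output
my @full_names = map { "$_ $last_name{$_}" } sort keys %last_name;
print "$_\n" for @full_names;
```

Unlike the array version, none of this needs index arithmetic: the first name itself is the subscript.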

Say we want to represent information with only slightly richer structure:

  • Cleaver (last name)
    • Ward (first name)
    • June (spouse's first name)
  • Flintstone
    • Fred
    • Wilma
  • Bunker
    • Archie
    • Edith

Before references came along, flat key-value hashes were about the best we could do, but references allow nested structures:

my %personal_info = (
    "Cleaver", {
        "FIRST",  "Ward",
        "SPOUSE", "June",
    },
    "Flintstone", {
        "FIRST",  "Fred",
        "SPOUSE", "Wilma",
    },
    "Bunker", {
        "FIRST",  "Archie",
        "SPOUSE", "Edith",
    },
);

Internally, the keys and values of %personal_info are all scalars, but the values are a special kind of scalar: hash references, created with {}. The references allow us to simulate "multi-dimensional" hashes. For example, we can get to Wilma via

$personal_info{Flintstone}->{SPOUSE}

Note that Perl allows us to omit arrows between subscripts, so the above is equivalent to

$personal_info{Flintstone}{SPOUSE}
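
As a rough sketch of how such a nested structure is usually walked, the loop below visits each last name in sorted order and reads the inner hash through its reference. The @lines array and the $person variable are illustrative names, and => is used here as a comma that quotes the word on its left.

```perl
use strict;
use warnings;

my %personal_info = (
    Cleaver    => { FIRST => "Ward",   SPOUSE => "June"  },
    Flintstone => { FIRST => "Fred",   SPOUSE => "Wilma" },
    Bunker     => { FIRST => "Archie", SPOUSE => "Edith" },
);

my @lines;
for my $last (sort keys %personal_info) {
    my $person = $personal_info{$last};   # each value is a hash reference
    push @lines, "$person->{FIRST} and $person->{SPOUSE} $last";
}
print "$_\n" for @lines;
```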

That's a lot of typing if you want to know more about Fred, so you might grab a reference as sort of a cursor:

my $fred = $personal_info{Flintstone};
print "Fred's wife is $fred->{SPOUSE}\n";

Because $fred in the snippet above is a hashref, the arrow is necessary. If you leave it out but have wisely enabled use strict to help you catch these sorts of errors, the compiler will complain:

Global symbol "%fred" requires explicit package name at ...
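
The message makes sense once you see that $fred{SPOUSE} means "element SPOUSE of the hash %fred", which is a different variable from the scalar $fred. The following sketch, with a cut-down %personal_info assumed for illustration, shows the spellings that do go through the reference:

```perl
use strict;
use warnings;

my %personal_info = (
    Flintstone => { FIRST => "Fred", SPOUSE => "Wilma" },
);

my $fred = $personal_info{Flintstone};

# element access through the reference
my $spouse = $fred->{SPOUSE};

# $$fred{SPOUSE} is an older, equivalent spelling of $fred->{SPOUSE}
my $same = $$fred{SPOUSE};

# %$fred dereferences the whole hash, e.g. to list its keys
my @fields = sort keys %$fred;

print "$spouse $same @fields\n";
```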

Perl references are similar to pointers in C and C++, but a reference can never be null: it always refers to something (a scalar that holds no reference is simply undef). Pointers in C and C++ require dereferencing, and so do references in Perl.

C and C++ function parameters have pass-by-value semantics: they're just copies, so modifications don't get back to the caller. If you want to see the changes, you have to pass a pointer. You can get this effect with references in Perl:

sub add_barney {
    my($personal_info) = @_;

    $personal_info->{Rubble} = {
        FIRST  => "Barney",
        SPOUSE => "Betty",
    };
}

add_barney \%personal_info;

Without the backslash, Perl would flatten %personal_info into a plain list of keys and values, so add_barney would have no handle on the caller's hash and could not add an entry to it.
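
To see the difference side by side, here is a small sketch that calls two subs on the same hash. add_barney_to_copy is a hypothetical helper introduced only for this comparison; it rebuilds a hash from the flattened argument list, while add_barney takes a reference as above.

```perl
use strict;
use warnings;

# hypothetical helper: receives the flattened hash as a list and copies it
sub add_barney_to_copy {
    my %copy = @_;
    $copy{Rubble} = { FIRST => "Barney", SPOUSE => "Betty" };
    return scalar keys %copy;
}

# receives a reference, so it modifies the caller's hash
sub add_barney {
    my ($personal_info) = @_;
    $personal_info->{Rubble} = { FIRST => "Barney", SPOUSE => "Betty" };
}

my %personal_info = (
    Flintstone => { FIRST => "Fred", SPOUSE => "Wilma" },
);

my $size_of_copy = add_barney_to_copy(%personal_info);
my $after_copy   = exists $personal_info{Rubble} ? "yes" : "no";

add_barney(\%personal_info);
my $after_ref = exists $personal_info{Rubble} ? "yes" : "no";

print "after copy: $after_copy, after reference: $after_ref\n";
```

One caveat: the copy made inside add_barney_to_copy is shallow. Its top-level keys are independent, but the nested hash references in it still point at the same inner hashes as the original.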

Note also the use of the "fat comma" (=>) above. It autoquotes the bareword on its left and makes hash initializations less syntactically noisy.

Solution 2:

The following demonstrates how you can use a hash and a hash reference:

my %hash = (
    toy    => 'aeroplane',
    colour => 'blue',
);
print "I have an ", $hash{toy}, " which is coloured ", $hash{colour}, "\n";

my $hashref = \%hash;
print "I have an ", $hashref->{toy}, " which is coloured ", $hashref->{colour}, "\n";
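
It may also help to see that the hash and the reference are two ways of reaching the same data. The sketch below writes through the reference and then reads the hash directly; %copy and $kind are illustrative names.

```perl
use strict;
use warnings;

my %hash = ( toy => 'aeroplane', colour => 'blue' );
my $hashref = \%hash;

# writing through the reference changes the original hash
$hashref->{colour} = 'red';

# ref reports what kind of thing a reference points to
my $kind = ref $hashref;

# assigning the dereferenced hash makes an independent (shallow) copy
my %copy = %$hashref;
$copy{toy} = 'train';

print "$kind: $hash{toy}, $hash{colour}\n";
```

Changing %copy leaves %hash alone, whereas changing anything via $hashref does not.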

Also see perldoc perldsc.

Solution 3:

A hash is a basic data type in Perl. It uses keys to access its contents.

A hash ref is short for a reference to a hash. References are scalars, that is, simple values. A hash ref is a scalar value that essentially contains a pointer to the actual hash itself.

Link: difference between hash and hash ref in perl - Ubuntu Forums

There is also a difference in the syntax for deleting. For hashes, Perl works like this:

delete $hash{$key};

and for Hash References

delete $hash_ref->{$key};
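
As a quick illustration of the two forms, here is a minimal sketch with made-up keys a and b. It also relies on the fact that delete returns the value it removed.

```perl
use strict;
use warnings;

my %hash     = ( a => 1, b => 2 );
my $hash_ref = { a => 1, b => 2 };   # anonymous hash, reachable only via the reference

my $removed = delete $hash{a};       # delete returns the removed value
delete $hash_ref->{b};

my @left_in_hash = sort keys %hash;
my @left_in_ref  = sort keys %$hash_ref;
print "removed $removed; hash: @left_in_hash; ref: @left_in_ref\n";
```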

The Perl Hash Howto is a great resource for understanding hashes versus hashes with hash references.

There is also another link here that has more information on perl and references.

Solution 4:

See perldoc perlreftut which is also accessible on your own computer's command line.

A reference is a scalar value that refers to an entire array or an entire hash (or to just about anything else). Names are one kind of reference that you're already familiar with. Think of the President of the United States: a messy, inconvenient bag of blood and bones. But to talk about him, or to represent him in a computer program, all you need is the easy, convenient scalar string "Barack Obama".

References in Perl are like names for arrays and hashes. They're Perl's private, internal names, so you can be sure they're unambiguous. Unlike "Barack Obama", a reference only refers to one thing, and you always know what it refers to. If you have a reference to an array, you can recover the entire array from it. If you have a reference to a hash, you can recover the entire hash. But the reference is still an easy, compact scalar value.
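
A small sketch of that last point, using made-up data: references to an array and to a hash are stored as ordinary scalar values inside another hash, and the full array and hash are recovered from them afterwards. All variable names here are illustrative.

```perl
use strict;
use warnings;

my @colours = ('red', 'green');
my %ages    = ( Fred => 40 );   # illustrative values

my $aref = \@colours;
my $href = \%ages;

# both references are plain scalars, so they can be stored as hash values
my %bundle = ( colours => $aref, ages => $href );

my @all_colours = @{ $bundle{colours} };   # recover the whole array
my %all_ages    = %{ $bundle{ages} };      # recover the whole hash

print scalar(@all_colours), " colours, Fred is $all_ages{Fred}\n";
```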