How can I maintain the order of keys I add to a Perl hash?
Perl hashes are not ordered, but as usual, CPAN offers a solution, Tie::IxHash:
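use Tie::IxHash;
my %count;
tie %count, 'Tie::IxHash';
while (my $line = <DATA>) {
    chomp $line;    # strip the trailing newline so it is not part of the key
    $count{$line}++ if $line =~ /\S/;
}
# each() on the tied hash returns pairs in insertion order
while (my ($key, $value) = each %count) {
    print "$key\t$value\n";
}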
Data in a hash table is stored according to the keys' hash codes, which for most purposes amounts to a random order. To get the keys back in the order you added them, you also need to record the order in which each key first appears. Here's one way to approach this problem:
my (%count, $line, @display_order);
while ($line = <DATA>) {
    chomp $line;    # strip the \n off the end of $line
    if ($line =~ /\S/) {
        if ($count{$line}++ == 0) {
            # this is the first time we have seen the key "$line"
            push @display_order, $line;
        }
    }
}
# now @display_order holds the keys of %count, in the order of first appearance
foreach my $key (@display_order) {
    print "$key\t $count{$key}\n";
}
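If you need this in more than one place, the same idea can be wrapped in a small subroutine. This is only a sketch using core Perl; the name count_in_order and its return shape (an array ref of keys plus a hash ref of counts) are my own choices for illustration, not something from a module:

```perl
use strict;
use warnings;

# Hypothetical helper: counts non-blank lines and remembers the order
# in which each distinct line first appeared.
# Returns (\@order, \%count).
sub count_in_order {
    my @lines = @_;
    my (%count, @order);
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines
        # post-increment returns the old count, which is false on first sight
        push @order, $line unless $count{$line}++;
    }
    return (\@order, \%count);
}

my ($order, $count) = count_in_order("a\n", "b\n", "\n", "a\n", "c\n");
print "$_\t$count->{$_}\n" for @$order;
```

With the sample input shown, this prints a, b and c in that order with counts 2, 1 and 1. Note that if you later delete a key from the hash, you have to remove it from the order array yourself.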
From perlfaq4's answer to "How can I make my hash remember the order I put elements into it?"
How can I make my hash remember the order I put elements into it?
Use the Tie::IxHash from CPAN.
use Tie::IxHash;
tie my %myhash, 'Tie::IxHash';
for (my $i=0; $i<20; $i++) {
    $myhash{$i} = 2*$i;
}
my @keys = keys %myhash;
# @keys = (0,1,2,3,...)
Simply:
my (%count, @order);
while(<DATA>) {
    chomp;
    push @order, $_ unless $count{$_}++;
}
print "$_ $count{$_}\n" for @order;
__DATA__
a
b
e
a
c
d
a
c
d
b
Or as a one-liner:
perl -nlE'$c{$_}++or$o[@o]=$_}{say"$_ $c{$_}"for@o'<<<$'a\nb\ne\na\nc\nd\na\nc\nd\nb'
Another option is David Golden's (@xdg) simple pure-Perl Hash::Ordered module. You gain order, but it is slower, since the hash becomes an object behind the scenes and you use methods for accessing and modifying hash elements.
There are probably benchmarks that quantify just how much slower the module is than regular hashes, but it's a cool way to work with key/value data structures in small scripts, and it is fast enough for me in that sort of application. The documentation mentions several other approaches to ordering a hash as well.
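For comparison with the other answers, here is a rough sketch of the same counting task with Hash::Ordered. It assumes the module is installed from CPAN and uses only the new, get, set and keys methods as I understand them from its documentation; check the docs for your installed version before relying on it:

```perl
use strict;
use warnings;
use Hash::Ordered;    # CPAN module, not part of core Perl

my $count = Hash::Ordered->new;

for my $item (qw(a b e a c d a c d b)) {
    # get returns undef for a key that is not there yet, so default to 0
    my $seen = $count->get($item) // 0;
    # set appends new keys at the end; existing keys keep their position
    $count->set($item => $seen + 1);
}

# keys returns the keys in the order they were first added
for my $key ($count->keys) {
    print "$key ", $count->get($key), "\n";
}
```

The trade-off is visible here: every read and write is a method call rather than a plain $hash{key} access, which is where the extra cost comes from.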