What reasons are there to prefer glob over readdir (or vice-versa) in Perl?
This question is a spin-off from this one. Some history: when I first learned Perl, I pretty much always used glob rather than opendir + readdir because I found it easier. Then later various posts and readings suggested that glob was bad, and so now I pretty much always use readdir.
After thinking over this recent question I realized that my reasons for one or the other choice may be bunk. So I'm going to lay out some pros and cons, and I'm hoping that more experienced Perl folks can chime in and clarify. The question in a nutshell: are there compelling reasons to prefer glob to readdir or readdir to glob (in some or all cases)?
glob pros:
- No dotfiles (unless you ask for them)
- Order of items is guaranteed
- No need to prepend the directory name onto items manually
- Better name (c'mon - glob versus readdir is no contest if we're judging by names alone)
- (From ysth's answer; cf. glob cons 4 below) Can return non-existent filenames:
@deck = glob "{A,K,Q,J,10,9,8,7,6,5,4,3,2}{\x{2660},\x{2665},\x{2666},\x{2663}}";
glob cons:
- Older versions are just plain broken (but 'older' means pre 5.6, I think, and frankly if you're using pre 5.6 Perl, you have bigger problems)
- Calls stat each time (i.e., useless use of stat in most cases).
- Problems with spaces in directory names (is this still true?)
- (From brian's answer) Can return filenames that don't exist:
$ perl -le 'print glob "{ab}{cd}"'
readdir pros:
- (From brian's answer) opendir gives you a directory handle which you can pass around in your program (and reuse), but glob simply returns a list
- (From brian's answer) readdir is a proper iterator, and you can move it around with rewinddir, seekdir, and telldir
- Faster? (Pure guess based on some of glob's features from above. I'm not really worried about this level of optimization anyhow, but it's a theoretical pro.)
- Less prone to edge-case bugs than glob?
- Reads everything (dotfiles too) by default (this is also a con)
- May convince you not to name a file 0 (a con also - see Brad's answer)
- Anyone? Bueller? Bueller?
readdir cons:
- If you don't remember to prepend the directory name, you will get bit when you try to do filetests or copy items or edit items or...
- If you don't remember to grep out the . and .. items, you will get bit when you count items, or try to walk recursively down the file tree or...
- Did I mention prepending the directory name? (A sidenote, but my very first post to the Perl Beginners mail list was the classic, "Why does this code involving filetests not work some of the time?" problem related to this gotcha. Apparently, I'm still bitter.)
- Items are returned in no particular order. This means you will often have to remember to sort them in some manner. (This could be a pro if it means more speed, and if it means that you actually think about how and if you need to sort items.) Edit: Horrifically small sample, but on a Mac readdir returns items in alphabetical order, case insensitive. On a Debian box and an OpenBSD server, the order is utterly random. I tested the Mac with Apple's built-in Perl (5.8.8) and my own compiled 5.10.1. The Debian box is 5.10.0, as is the OpenBSD machine. I wonder if this is a filesystem issue, rather than Perl?
- Reads everything (dotfiles too) by default (this is also a pro)
- Doesn't necessarily deal well with a file named 0 (see pros also - see Brad's answer)
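To make the trade-offs in these lists concrete, here is a minimal sketch that lists the same scratch directory both ways. The file names are made up for illustration, and the glob call assumes the temporary directory path contains no whitespace or wildcard characters.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Scratch directory holding a dotfile and a file named "0" (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(b.txt a.txt .hidden 0)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# glob: results carry the directory prefix, come back sorted, and skip dotfiles.
# Assumes $dir contains no whitespace or wildcard characters.
my @via_glob = glob "$dir/*";

# readdir: bare names in filesystem order, including ".", ".." and dotfiles.
opendir my $dh, $dir or die "Cannot open $dir: $!";
my @via_readdir;
# Explicit defined() so an entry named "0" cannot end the loop early.
while (defined(my $entry = readdir $dh)) {
    next if $entry eq '.' or $entry eq '..';   # drop the two special entries
    push @via_readdir, "$dir/$entry";          # prepend the directory by hand
}
closedir $dh;
@via_readdir = sort @via_readdir;              # impose an order ourselves

print "glob:    ", join(' ', map { basename($_) } @via_glob), "\n";
print "readdir: ", join(' ', map { basename($_) } @via_readdir), "\n";
```

The readdir half needs three manual steps (filter, prepend, sort) that glob does implicitly, but it is also the only half that sees .hidden without being asked.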
Solution 1:
You missed the most important, biggest difference between them: glob gives you back a list, but opendir gives you a directory handle. You can pass that directory handle around to let other objects or subroutines use it. With the directory handle, the subroutine or object doesn't have to know anything about where it came from, who else is using it, and so on:
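sub use_any_dir_handle {
    my( $dh ) = @_;
    rewinddir $dh;
    # ...do some filtering... (placeholder filter: skip the . and .. entries)
    my @files = grep { $_ ne '.' and $_ ne '..' } readdir $dh;
    return \@files;
}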
With the dirhandle, you have a controllable iterator where you can move around with seekdir, whereas with glob you just get the next item.
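As a rough illustration of that iterator control, the sketch below saves a position with telldir, jumps back to it with seekdir, and then hands the same handle to a subroutine that rewinds it. The helper count_entries and the file names are hypothetical, and the example assumes a filesystem where seeking to a saved telldir position behaves as documented.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with three files (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(one two three)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# Hypothetical helper: it receives only a handle and never learns the path.
sub count_entries {
    my ($dh) = @_;
    rewinddir $dh;                       # start over, wherever the caller left off
    my $count = 0;
    while (defined(my $entry = readdir $dh)) {
        $count++ unless $entry eq '.' or $entry eq '..';
    }
    return $count;
}

opendir my $dh, $dir or die "Cannot open $dir: $!";

my $pos   = telldir $dh;                 # remember the current position
my $first = readdir $dh;                 # advance by one entry
seekdir $dh, $pos;                       # jump back to the saved position
my $again = readdir $dh;                 # reads the same entry again

my $total       = count_entries($dh);    # the helper rewinds for itself
my $total_again = count_entries($dh);    # so the handle can be reused
closedir $dh;

print "re-read the same entry: ", ($first eq $again ? "yes" : "no"), "\n";
print "entries counted: $total, then $total_again\n";
```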
As with anything though, the costs and benefits only make sense when applied to a certain context. They do not exist outside of a particular use. You have an excellent list of their differences, but I wouldn't classify those differences without knowing what you were trying to do with them.
Some other things to remember:
- You can implement your own glob with opendir, but not the other way around. glob uses its own wildcard syntax, and that's all you get.
- glob can return filenames that don't exist:
$ perl -le 'print glob "{ab}{cd}"'
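To illustrate the first point in that list, here is a minimal sketch of a home-grown glob built on opendir. The name simple_glob is hypothetical, and it handles only the * and ? wildcards (no braces, character classes, or tilde expansion), so it is a demonstration of the idea rather than a replacement for the built-in.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: match entries of $dir against a pattern using * and ? only.
sub simple_glob {
    my ($dir, $pattern) = @_;

    # Translate the wildcard pattern into a regex, escaping everything else.
    my $regex = join '', map {
          $_ eq '*' ? '.*'
        : $_ eq '?' ? '.'
        :             quotemeta($_)
    } split //, $pattern;

    # Like glob, skip dotfiles unless the pattern itself starts with a dot.
    my $want_dotfiles = $pattern =~ /\A\./;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { /\A$regex\z/s and ($want_dotfiles or !/\A\./) } readdir $dh;
    closedir $dh;

    # Unlike glob, only names that really exist in $dir can come back.
    my @paths = map { "$dir/$_" } sort @names;
    return @paths;
}

# Scratch directory to try it on (hypothetical file names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.log .hidden.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

my @txt = simple_glob($dir, '*.txt');
my @log = simple_glob($dir, '?.log');
print scalar(@txt), " .txt match(es), ", scalar(@log), " .log match(es)\n";
```

Because the matching is ordinary Perl code, the wildcard syntax can be swapped for a full regex or any other filter, which is the flexibility the built-in glob does not offer.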
Solution 2:
glob pros: Can return 'filenames' that don't exist:
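use feature 'say';   # needed for say
use List::Util;      # load the module so List::Util::shuffle is defined
my @deck = List::Util::shuffle glob "{A,K,Q,J,10,9,8,7,6,5,4,3,2}{\x{2660},\x{2665},\x{2666},\x{2663}}";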
while (my @hand = splice @deck,0,13) {
say join ",", @hand;
}
__END__
6♥,8♠,7♠,Q♠,K♣,Q♦,A♣,3♦,6♦,5♥,10♣,Q♣,2♠
2♥,2♣,K♥,A♥,8♦,6♠,8♣,10♠,10♥,5♣,3♥,Q♥,K♦
5♠,5♦,J♣,J♥,J♦,9♠,2♦,8♥,9♣,4♥,10♦,6♣,3♠
3♣,A♦,K♠,4♦,7♣,4♣,A♠,4♠,7♥,J♠,9♥,7♦,9♦