How can I speed up my Perl program?
These are really two questions, but they are similar enough that I've rolled them together to keep things simple:
Firstly: Given an established Perl project, what are some decent ways to speed it up beyond just plain in-code optimization?
Secondly: When writing a program from scratch in Perl, what are some good ways to greatly improve performance?
For the first question, imagine you are handed a decently written project and you need to improve performance, but you can't seem to get much of a gain through refactoring/optimization. What would you do to speed it up in this case short of rewriting it in something like C?
Please stay away from general optimization techniques unless they are Perl specific.
I asked this about Python earlier, and I figured it might be good to do it for other languages (I'm especially curious if there are equivalents to psyco and pyrex for Perl).
Solution 1:
Please remember the rules of Optimization Club:
- The first rule of Optimization Club is, you do not Optimize.
- The second rule of Optimization Club is, you do not Optimize without measuring.
- If your app is running faster than the underlying transport protocol, the optimization is over.
- One factor at a time.
- No marketroids, no marketroid schedules.
- Testing will go on as long as it has to.
- If this is your first night at Optimization Club, you have to write a test case.
So, assuming you actually have working code, run your program under Devel::NYTProf.
Find the bottlenecks. Then come back here to tell us what they are.
If you don't have working code, get it working first. The single biggest optimization you will ever make is going from non-working to working.
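For anyone who hasn't run the profiler before, here is a minimal sketch of what a session might look like. The script name `slow.pl` and the subs `slow_join` and `fast_join` are made up for illustration, and the commands in the comments assume Devel::NYTProf has been installed from CPAN.

```perl
#!/usr/bin/perl
# Hypothetical script, saved as slow.pl. To profile it:
#
#   perl -d:NYTProf slow.pl    # writes nytprof.out in the current directory
#   nytprofhtml                # turns nytprof.out into an HTML report in ./nytprof/
use strict;
use warnings;

# Deliberately wasteful: splits and re-joins the whole string on every pass.
sub slow_join {
    my (@words) = @_;
    my $out = '';
    for my $w (@words) {
        $out = join ',', grep { length } split(/,/, $out), $w;
    }
    return $out;
}

# The obvious replacement, to compare against once the profiler points here.
sub fast_join {
    my (@words) = @_;
    return join ',', @words;
}

my @words = map { "w$_" } 1 .. 1000;
print length(slow_join(@words)), "\n";
print length(fast_join(@words)), "\n";
```

In the report, `slow_join` should dominate the run time, which is the sort of bottleneck worth bringing back here.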
Solution 2:
Andy has already mentioned Devel::NYTProf. It's awesome. Really, really awesome. Use it.
If for some reason you can't use Devel::NYTProf, then you can fall back to good old Devel::DProf, which came standard with Perl for a long time (it was removed from the core in Perl 5.16, so on newer installations it has to come from CPAN).
If you have true functions (in the mathematical sense) which take a long time to calculate (e.g., Fibonacci numbers), then you may find Memoize provides some speed improvement.
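As a rough sketch of the Memoize idea, using the Fibonacci example mentioned above (the sub name `fib` is just an illustration):

```perl
use strict;
use warnings;
use Memoize;

# Naive recursive Fibonacci: exponential time without caching.
sub fib {
    my ($n) = @_;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Replace fib() with a caching wrapper. The recursive calls are looked up
# by name at run time, so they go through the cache as well.
memoize('fib');

print fib(50), "\n";    # prints 12586269025
```

Without the `memoize` line the same call takes a very long time. This only makes sense for functions whose result depends on nothing but their arguments.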
A lot of poor performance comes from inappropriate data structures and algorithms. A good course in computer science can help immensely here. If you have two ways of doing things, and would like to compare their performance, the Benchmark module can also prove useful.
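A minimal sketch of comparing two approaches with Benchmark. The two subs are hypothetical stand-ins for whatever alternatives you want to compare; `cmpthese` with a negative count runs each one for at least that many CPU seconds.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(sum);

my @numbers = (1 .. 1000);

# Candidate 1: a hand-written loop.
sub sum_loop {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Candidate 2: the core List::Util implementation.
sub sum_listutil {
    return sum(@_);
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    loop     => sub { sum_loop(@numbers) },
    listutil => sub { sum_listutil(@numbers) },
});
```

The table shows the rate for each candidate and the percentage difference between them. The numbers depend on your machine and perl build, so run it where the real code runs.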
The following Perl Tips may also prove useful here:
- Sorting with expensive comparisons
- Profiling with Devel::DProf
- Big-O notation and algorithmic complexity
- Searching for items in a large list
- Benchmarking basics
- Memoizing
Disclaimer: I wrote some of the resources above, so I may be biased towards them.
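To give a flavour of the first topic in that list, here is a sketch of one common way to handle sorting with expensive comparisons: compute each sort key once and sort on the cached keys (often called the Schwartzian transform). The `expensive_key` sub is a made-up stand-in, and I'm not claiming this is exactly what the linked tip does.

```perl
use strict;
use warnings;

# Hypothetical expensive key: imagine this hits the disk or parses a date.
sub expensive_key {
    my ($item) = @_;
    return length $item;
}

# Naive: the key is recomputed for both sides of every comparison.
sub sort_naive {
    my @sorted = sort { expensive_key($a) <=> expensive_key($b) } @_;
    return @sorted;
}

# Cached: compute each key once, sort on it, then unwrap the original items.
sub sort_cached {
    my @sorted = map  { $_->[1] }
                 sort { $a->[0] <=> $b->[0] }
                 map  { [ expensive_key($_), $_ ] } @_;
    return @sorted;
}

print join(' ', sort_cached(qw(ccc a bb))), "\n";    # prints: a bb ccc
```

The cached version calls `expensive_key` once per item instead of twice per comparison, which matters more as the list grows.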
Solution 3:
There are many things that you might improve on, so you first have to figure out what's slow. Others have already answered that question. I talk about this a bit in Mastering Perl too.
An incomplete list of things to think about as you are writing new code:
Profile with something like Devel::NYTProf to see where you are spending most of your time in the code. Sometimes that's surprising and easy to fix. Mastering Perl has a lot of advice about that.
Perl has to compile the source every time and compilation can be slow. It has to find all the files and so on. See, for instance, "A Timely Start", by Jean-Louis Leroy, where he speeds everything up just by optimizing module locations in @INC. If your start-up costs are expensive and unavoidable, you might also look at persistent perls, like pperl, mod_perl, and so on.
Look at some of the modules you use. Do they have long chains of dependencies just to do simple things? Sure, we don't like re-invention, but if the wheel you want to put on your car also comes with three boats, five goats, and a cheeseburger, maybe you want to build your own wheel (or find a different one).
Method calls can be expensive. In the Perl::Critic test suite, for instance, its calls to isa slow things down. It's not something that you can really avoid in all cases, but it is something to keep in mind. Someone had a great quote that went something like "No one minds giving up a factor of 2; it's when you have ten people doing it that it's bad." :) Perl v5.22 has some performance improvements for this.
If you're calling the same expensive methods over and over again but getting the same answers, something like Memoize might be for you. It's a proxy for the method call. If it's really a function (meaning, same input gives same output with no side effects), you don't really need to call it repeatedly.
Modules such as Apache::DBI can reuse database handles for you to avoid the expensive opening of database connections. It's really simple code, so looking inside can show you how to do that even if you aren't using Apache.
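The idea of reusing a handle can be sketched without any database at all. Everything below is hypothetical: `cached_connect` and the fake connector are illustrations of the caching idea, not Apache::DBI's actual code.

```perl
use strict;
use warnings;

# Cache of open handles, keyed by the connection parameters.
my %handle_for;

sub cached_connect {
    my ($connector, @params) = @_;
    my $key = join "\0", @params;
    # Only call the (expensive) connector the first time we see these params.
    $handle_for{$key} //= $connector->(@params);
    return $handle_for{$key};
}

# Stand-in for something like DBI->connect; counts how often it really runs.
my $connections_opened = 0;
my $fake_connect = sub {
    my (@params) = @_;
    $connections_opened++;
    return { dsn => $params[0], id => $connections_opened };
};

my $h1 = cached_connect($fake_connect, 'dbi:Fake:db1', 'user');
my $h2 = cached_connect($fake_connect, 'dbi:Fake:db1', 'user');   # reuses $h1
my $h3 = cached_connect($fake_connect, 'dbi:Fake:db2', 'user');   # new handle

print "opened $connections_opened connections\n";   # prints: opened 2 connections
```

A real version would also have to check that a cached handle is still alive before handing it back. DBI itself offers `connect_cached`, which is worth a look before rolling your own.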
Perl doesn't do tail recursion optimization for you, so don't come over from Lisp thinking you're going to make these super fast recursive algorithms. You can turn those into iterative solutions easily (and we talk about that in Intermediate Perl).
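A small sketch of that conversion, with made-up sub names. The recursive version makes its call in tail position, but Perl still creates a new stack frame (and here a new copy of the list) for every element.

```perl
use strict;
use warnings;

# Tail-recursive sum: an accumulator plus the remaining items.
sub sum_recursive {
    my ($total, @rest) = @_;
    return $total unless @rest;
    my $next = shift @rest;
    return sum_recursive($total + $next, @rest);
}

# The same computation as a loop: no extra frames, no list copying.
sub sum_iterative {
    my ($total, @rest) = @_;
    $total += $_ for @rest;
    return $total;
}

print sum_recursive(0, 1 .. 50), "\n";         # prints 1275
print sum_iterative(0, 1 .. 100_000), "\n";    # prints 5000050000
```

With warnings enabled, the recursive version would start emitting "Deep recursion" warnings once it goes past 100 levels, and it gets slower much faster than the loop as the list grows.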
Look at your regexes. Lots of open ended quantifiers (e.g. .*) can lead to lots of backtracking. Check out Jeffrey Friedl's Mastering Regular Expressions for all the gory details (and across several languages). Also check out his regex website.
Know how your perl is compiled. Do you really need threading and -DDEBUGGING? Those slow you down a bit. Check out the perlbench utility to compare different perl binaries.
Benchmark your application against different Perls. Some newer versions have speedups, but also some older versions can be faster for limited sets of operations. I don't have particular advice since I don't know what you are doing.
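To see how the perl you are running was built, the core Config module exposes the build settings. This is a sketch: `build_report` is a made-up helper, and looking for `-DDEBUGGING` in `ccflags` is a heuristic I'm assuming holds for typical builds (`perl -V` shows the full picture).

```perl
use strict;
use warnings;
use Config;

# Collect a few build options of the running perl that can affect speed.
sub build_report {
    return (
        version   => $Config{version},
        threads   => ($Config{useithreads} ? 'yes' : 'no'),
        debugging => (($Config{ccflags} // '') =~ /-DDEBUGGING\b/ ? 'yes' : 'no'),
        optimize  => ($Config{optimize} // 'unknown'),
    );
}

my %report = build_report();
printf "%-10s %s\n", $_, $report{$_} for sort keys %report;
```

Running this under each perl binary you have installed gives a quick way to see which builds carry threading or debugging support before you benchmark them against each other.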
Spread out the work. Can you do some asynchronous work in other processes or on remote computers? Let your program work on other things as someone else figures out some subproblems. Perl has several asynchronous and load shifting modules. Beware, though, that the cost of the scaffolding needed to do that well might cancel out any benefit of doing it.
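As a bare-bones sketch of farming work out to other processes using only core Perl, here is a hypothetical `parallel_map` built on `fork` and `pipe`. It assumes a Unix-like system with a real `fork`, and small results that fit in a pipe buffer. The CPAN modules for this handle far more cases.

```perl
use strict;
use warnings;
use POSIX ();

# Run $code on each item in its own child process and collect one line of
# output per child, in the original order.
sub parallel_map {
    my ($code, @items) = @_;
    my @children;
    for my $item (@items) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                      # child
            close $reader;
            print {$writer} $code->($item), "\n";
            close $writer;                    # flushes the result to the parent
            POSIX::_exit(0);                  # leave without running END blocks
        }
        close $writer;                        # parent keeps only the read end
        push @children, [ $pid, $reader ];
    }
    my @results;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $line = <$reader>;
        close $reader;
        waitpid $pid, 0;
        die "child $pid produced no output" unless defined $line;
        chomp $line;
        push @results, $line;
    }
    return @results;
}

# Hypothetical stand-in for an expensive, independent subproblem.
my @squares = parallel_map(sub { $_[0] ** 2 }, 1 .. 4);
print "@squares\n";    # prints: 1 4 9 16
```

Forking one process per item is only worth it when each item is expensive. For cheap work like squaring a number, the process overhead is exactly the kind of scaffolding that eats the benefit.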
Solution 4:
Without having to rewrite large chunks, you can use Inline::C to convert any single, slow subroutine to C. Or directly use XS. It's also possible to incrementally convert subs with XS. PPI/PPI::XS does this, for example.
But moving to another language is always a last resort. Maybe you should get an expert Perl programmer to look at your code? More likely than not, (s)he'd spot some peculiarity that's seriously hurting your performance. Other than that, profile your code. Remember, there is no silver bullet.
With regard to psyco and pyrex: No, there's no equivalent for Perl.