Python Performance - have you ever had to rewrite in something else? [closed]
Has anyone ever had code in Python that turned out not to perform fast enough?
I mean, you were forced to choose another language because of it?
We are investigating using Python for a couple of larger projects, and my feeling is that Python is fast enough for most scenarios (compared to, say, Java) because it relies on optimized C routines.
I wanted to see if people had instances where they started out in Python, but ended up having to go with something else because of performance.
Thanks.
Yes, I have. I wrote a row-count program for a binary (length-prefixed rather than delimited) bcp output file once and ended up having to redo it in C because the Python one was too slow. This program was quite small (it only took a couple of days to rewrite it in C), so I didn't bother to try to build a hybrid application (Python glue with central routines written in C), but that would also have been a viable route.
A larger application with performance-critical bits can be written in a combination of C and a higher-level language. You can write the performance-critical parts in C with an interface to Python for the rest of the system. SWIG, Pyrex or Boost.Python (if you're using C++) all provide good mechanisms to do the plumbing for your Python interface. The C API for Python is more complex than that for Tcl or Lua, but isn't infeasible to build by hand. For an example of a hand-built Python/C API, check out cx_Oracle.
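If you just want to see the shape of the plumbing, here's a rough sketch using ctypes from the standard library rather than SWIG/Pyrex/Boost.Python; the library name and C signature are made up for illustration:

    # Hypothetical sketch: binding a C routine with ctypes instead of a
    # generated wrapper. Assumes a compiled shared library
    # "librowcount.so" exporting: long count_rows(const char *path);
    import ctypes

    lib = ctypes.CDLL("./librowcount.so")        # load the compiled C core
    lib.count_rows.argtypes = [ctypes.c_char_p]  # declare the C signature
    lib.count_rows.restype = ctypes.c_long

    def count_rows(path):
        """Python-facing wrapper around the C routine."""
        return lib.count_rows(path.encode("utf-8"))

ctypes trades some per-call overhead for not needing a generated wrapper layer; for very hot call paths a real extension module is usually the better choice.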
This approach has been used on quite a number of successful applications going back as far as the 1970s (that I am aware of). Mozilla was substantially written in JavaScript around a core engine written in C. Several CAD packages, Interleaf (a technical document publishing system) and of course Emacs are substantially written in Lisp around a central core written in C, assembly language or the like. Quite a few commercial and open-source applications (e.g. Chandler or Sungard Front Arena) use embedded Python interpreters and implement substantial parts of the application in Python.
EDIT: In response to Dutch Masters' comment, keeping someone with C or C++ programming skills on the team for a Python project gives you the option of writing parts of the application for speed. The areas where you can expect a significant performance gain are those where the application does something highly iterative over a large data structure or a large volume of data. In the case of the row-counter above, it had to inhale a series of files totalling several gigabytes and go through a process where it read a varying-length prefix and used that to determine the length of the data field. Most of the fields were short (just a few bytes long). The work was somewhat bit-twiddly, very low-level and iterative, which made it a natural fit for C.
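For a feel of why that loop hurt in Python, here's roughly the shape of the pure-Python version; the 2-byte little-endian prefix and the fields-per-row constant are assumptions, since the real bcp format details depend on the table:

    # Rough pure-Python shape of the row counter described above.
    # Assumed format: each field is a 2-byte little-endian length prefix
    # followed by that many data bytes; every row has FIELDS_PER_ROW fields.
    import struct

    FIELDS_PER_ROW = 10  # hypothetical; depends on the table definition

    def count_rows(path):
        rows = fields = 0
        with open(path, "rb") as f:
            while True:
                prefix = f.read(2)
                if len(prefix) < 2:           # end of file
                    break
                (length,) = struct.unpack("<H", prefix)
                f.seek(length, 1)             # skip the field's data bytes
                fields += 1
                if fields % FIELDS_PER_ROW == 0:
                    rows += 1
        return rows

The interpreter executes that loop body once per field, millions of times over a multi-gigabyte file, which is exactly the workload where C pays off.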
Many Python libraries, such as numpy, cElementTree or cStringIO, make use of an optimised C or Fortran core with a Python API that facilitates working with data in aggregate. For example, numpy has matrix data structures and operations written in C that do all the hard work, and a Python API that provides services at the aggregate level.
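The point of "working in aggregate" is that one numpy expression replaces an interpreted loop; a small generic illustration:

    # Working "in aggregate": the per-element loop runs inside numpy's C core.
    import numpy as np

    a = np.random.rand(1000, 1000)
    b = np.random.rand(1000, 1000)

    # One expression; the element-wise multiply and the reduction both
    # execute in compiled code rather than in the Python interpreter.
    total = (a * b).sum()

    # The pure-Python equivalent iterates a million times in the interpreter:
    # total = sum(a[i][j] * b[i][j] for i in range(1000) for j in range(1000))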
This is a much more difficult question to answer than people are willing to admit.
For example, it may be that I am able to write a program that performs better in Python than it does in C. The fallacious conclusion from that statement is "Python is therefore faster than C". In reality, it may be because I have much more recent experience in Python and its best practices and standard libraries.
In fact, no one can really answer your question unless they are certain that they can create an optimal solution in both languages, which is unlikely. In other words, "My C solution was faster than my Python solution" is not the same as "C is faster than Python".
I'm willing to bet that Guido van Rossum could have written Python solutions for Adam's and Dustin's problems that performed quite well.
My rule of thumb is that unless you are writing the sort of application that requires you to count clock cycles, you can probably achieve acceptable performance in Python.
Adding my $0.02 for the record.
My work involves developing numeric models that run over hundreds of gigabytes of data. The hard problems are coming up with a revenue-generating solution quickly (i.e. time-to-market). To be commercially successful the solution also has to execute quickly (compute the solution in a minimal amount of time).
For us Python has proven to be an excellent choice to develop solutions for the reasons commonly cited: fast development time, language expressiveness, rich libraries, etc. But to meet the execution speed needs we've adopted the 'Hybrid' approach that several responses have already mentioned.
- Using numpy for computationally intensive parts. With numpy we get within 1.1x to 2.5x of the speed of a 'native' C++ solution, with less code, fewer bugs, and shorter development times.
- Pickling (Python's object serialization) intermediate results to minimize re-computation. The nature of our system requires multiple steps over the same data, so we 'memoize' the results and re-use them where possible (see the sketch after this list).
- Profiling and choosing better algorithms. It's been said in other responses, but I'll repeat it: we whip out cProfile and try to replace hot spots with a better algorithm. Not applicable in all cases.
- Going to C++. If the above fails then we call a C++ library. We use PyBindGen to write our Python/C++ wrappers. We found it far superior to SWIG, SIP, and Boost.Python as it produces direct Python C API code without an intermediate layer.
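To make the pickling point concrete, here's a minimal sketch of the idea; the cache-file naming scheme is illustrative, not our actual code:

    # Minimal sketch of pickling intermediate results to skip re-computation.
    import os
    import pickle

    def cached(step_name, compute, cache_dir="cache"):
        """Load a pickled result for `step_name` if present; otherwise
        run `compute()` and pickle whatever it returns."""
        path = os.path.join(cache_dir, step_name + ".pkl")
        if os.path.exists(path):
            with open(path, "rb") as f:
                return pickle.load(f)
        result = compute()
        os.makedirs(cache_dir, exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(result, f, pickle.HIGHEST_PROTOCOL)
        return result

    # usage: results = cached("step1_aggregates", lambda: expensive_step(data))

For the profiling step, running "python -m cProfile -s cumulative yourscript.py" is usually enough to show where the time goes.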
Reading this list you might think "What a lot of re-work! I'll just do it in [C/C++/Java/assembler] the first time around and be done with it."
Let me put it into perspective. Using Python we were able to produce a working, revenue-generating application in 5 weeks; in other languages, projects of similar scope had previously required 3 months. And that includes the time needed to optimize the Python parts we found to be slow.
While at uni we were writing a computer vision system for analysing human behaviour based on video clips. We used Python because of the excellent PIL, which sped up development and gave us easy access to the image frames we'd extracted from the video for converting to arrays, etc.
For 90% of what we wanted it was fine, and since the images were reasonably low resolution, the speed wasn't bad. However, a few of the processes required complex pixel-by-pixel computations as well as convolutions, which are notoriously slow. For these particular areas we rewrote the innermost parts of the loops in C and just updated the old Python functions to call the C functions.
This gave us the best of both worlds. We had the ease of data access that Python provides, which enabled us to develop quickly, and then the straight-line speed of C for the most intensive computations.
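For the curious, the hot spots looked roughly like this before the rewrite; grayscale mode and the 3x3 kernel are just illustrative assumptions:

    # The sort of pixel-by-pixel inner loop we ended up pushing down into C.
    from PIL import Image

    KERNEL = [[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]]  # simple blur kernel; weights sum to 16

    def convolve(img):
        src = img.convert("L").load()    # pixel access object
        w, h = img.size
        out = Image.new("L", (w, h))
        dst = out.load()
        for y in range(1, h - 1):        # this double loop is the hot spot
            for x in range(1, w - 1):
                acc = 0
                for ky in range(3):
                    for kx in range(3):
                        acc += KERNEL[ky][kx] * src[x + kx - 1, y + ky - 1]
                dst[x, y] = acc // 16
        return out

In the hybrid version the function signature stayed the same; only the body changed to hand the pixel data to the C routine, so the rest of the Python code never noticed.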