Python import coding style
The (previously) top-voted answer to this question is nicely formatted but absolutely wrong about performance. Let me demonstrate:
Performance
Top Import
import random

def f():
    L = []
    for i in xrange(1000):
        L.append(random.random())

for i in xrange(1000):
    f()
$ time python import.py
real 0m0.721s
user 0m0.412s
sys 0m0.020s
Import in Function Body
def f():
    import random
    L = []
    for i in xrange(1000):
        L.append(random.random())

for i in xrange(1000):
    f()
$ time python import2.py
real 0m0.661s
user 0m0.404s
sys 0m0.008s
As you can see, it can be more efficient to import the module in the function. The reason for this is simple: it moves the reference from a global reference to a local reference. This means that, for CPython at least, the compiler will emit LOAD_FAST instructions instead of LOAD_GLOBAL instructions. These are, as the name implies, faster. The other answerer artificially inflated the performance hit of looking in sys.modules by importing on every single iteration of the loop.
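Incidentally, you can see this difference in the generated bytecode with the dis module. A minimal sketch (the function names are mine, and the exact bytecode output varies across CPython versions):

import dis
import random

def top_import_style():
    return random.random()    # 'random' is looked up with LOAD_GLOBAL

def function_import_style():
    import random             # binds 'random' as a local variable
    return random.random()    # 'random' is looked up with LOAD_FAST

dis.dis(top_import_style)
dis.dis(function_import_style)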
As a rule, it's best to import at the top, but performance is not the reason if you are accessing the module many times. The real reasons are that it makes it easier to keep track of what a module depends on, and that doing so is consistent with most of the rest of the Python universe.
Importing inside functions does have a few disadvantages.
Testing
On the off chance that you want to test your module through runtime modification, importing inside functions may make that more difficult. Instead of doing
import mymodule
mymodule.othermodule = module_stub
You'll have to do
import othermodule
othermodule.foo = foo_stub
This means that you'll have to patch othermodule globally, as opposed to just changing what the reference in mymodule points to.
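To make the trade-off concrete, here is a minimal sketch of the stubbing pattern described above (the module layout and names are mine, purely for illustration):

# mymodule.py -- imports at the top
import random

def roll():
    return random.randint(1, 6)

# test code -- replace the reference that mymodule holds
import mymodule

class FixedRandom(object):
    def randint(self, a, b):
        return 4    # deterministic stub

mymodule.random = FixedRandom()
assert mymodule.roll() == 4

If roll() imported random inside its body instead, the assignment above would have no effect, because each call would fetch the real module from sys.modules again.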
Dependency Tracking
This makes it non-obvious what modules your module depends on. This is especially irritating if you use many third party libraries or are re-organizing code.
I had to maintain some legacy code that used imports inline all over the place, and it made the code extremely difficult to refactor or repackage.
Notes On Performance
Because of the way Python caches modules, there isn't a performance hit. In fact, since the module reference lives in the local namespace, there is a slight performance benefit to importing modules in a function.
Top Import
import random

def f():
    L = []
    for i in xrange(1000):
        L.append(random.random())

for i in xrange(10000):
    f()
$ time python test.py
real 0m1.569s
user 0m1.560s
sys 0m0.010s
Import in Function Body
def f():
    import random
    L = []
    for i in xrange(1000):
        L.append(random.random())

for i in xrange(10000):
    f()
$ time python test2.py
real 0m1.385s
user 0m1.380s
sys 0m0.000s
A few problems with this approach:
- It's not immediately obvious when opening the file which modules it depends on.
- It will confuse programs that have to analyze dependencies, such as py2exe, py2app, etc.
- What about modules that you use in many functions? You will either end up with a lot of redundant imports or you'll have to have some at the top of the file and some inside functions.
So... the preferred way is to put all imports at the top of the file. I've found that if my imports get hard to keep track of, it usually means I have too much code and would be better off splitting it into two or more files.
Some situations where I have found imports inside functions to be useful:
- To deal with circular dependencies (if you really really can't avoid them)
- Platform-specific code (see the sketch below)
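As a minimal sketch of the platform-specific case (the function is mine, not from any particular library):

import sys

def read_single_keypress():
    # msvcrt exists only on Windows; termios and tty exist only on
    # POSIX systems, so neither import can live at the top of the file.
    if sys.platform == "win32":
        import msvcrt
        return msvcrt.getch()
    else:
        import termios
        import tty
        fd = sys.stdin.fileno()
        old = termios.tcgetattr(fd)
        try:
            tty.setraw(fd)
            return sys.stdin.read(1)
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old)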
Also: putting imports inside each function is actually not appreciably slower than putting them at the top of the file. The first time each module is loaded it is put into sys.modules, and each subsequent import costs only the time to look up the module, which is fairly fast (it is not reloaded).
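You can observe the caching directly: repeated imports hand back the same cached module object rather than reloading it.

import sys
import random as first

import random as second    # no reload here, just a sys.modules lookup
assert first is second
assert sys.modules["random"] is first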
Another useful thing to note is that the from module import * syntax inside of a function has been removed in Python 3.0.
There is a brief mention of it under "Removed Syntax" here:
http://docs.python.org/3.0/whatsnew/3.0.html
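In Python 3 this shows up as a compile-time SyntaxError. A quick way to see it without crashing the script (the exact message wording may differ between versions):

try:
    compile("def f():\n    from math import *\n", "<example>", "exec")
except SyntaxError as err:
    print(err)    # e.g. "import * only allowed at module level"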