How to load a module from code in a string?
I have some code in the form of a string and would like to make a module out of it without writing to disk.
When I try using imp and a StringIO object to do this, I get:
>>> imp.load_source('my_module', '', StringIO('print "hello world"'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: load_source() argument 3 must be file, not instance
>>> imp.load_module('my_module', StringIO('print "hello world"'), '', ('', '', 0))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: load_module arg#2 should be a file or None
How can I create the module without having an actual file? Alternatively, how can I wrap a StringIO in a file without writing to disk?
UPDATE:
NOTE: The same problem occurs in Python 3.
The code I'm trying to load is only partially trusted. I've gone through it with ast and determined that it doesn't import anything or do anything I don't like, but I don't trust it enough to run it when I have local variables running around that could get modified, and I don't trust my own code to stay out of the way of the code I'm trying to import.
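The kind of ast check mentioned above might look roughly like this. It is only a sketch: the helper name find_imports is hypothetical, and it flags nothing beyond import statements and direct calls to __import__, so it does not cover whatever other checks were actually performed.

```python
import ast


def find_imports(source):
    """Return sorted (line number, description) pairs for import-like nodes."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = ", ".join(alias.name for alias in node.names)
            found.append((node.lineno, "import " + names))
        elif isinstance(node, ast.ImportFrom):
            found.append((node.lineno, "from %s import ..." % (node.module or ".")))
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Name)
              and node.func.id == "__import__"):
            found.append((node.lineno, "__import__(...) call"))
    return sorted(found)


# Hypothetical input: one harmless line followed by two imports.
print(find_imports("a = 5\nimport os\nfrom sys import path\n"))
```

An empty list means none of these three patterns were found; it does not mean the code is safe to run.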
I created an empty module that only contains the following:
def load(code):
    # Delete all local variables
    globals()['code'] = code
    del locals()['code']
    # Run the code
    exec(globals()['code'])
    # Delete any global variables we've added
    del globals()['load']
    del globals()['code']
    # Copy k so we can use it
    if 'k' in locals():
        globals()['k'] = locals()['k']
        del locals()['k']
    # Copy the rest of the variables
    # (list() snapshots the keys; in Python 3 the loop variable k would
    # otherwise change the dict while it is being iterated)
    for k in list(locals().keys()):
        globals()[k] = locals()[k]
Then you can import mymodule and call mymodule.load(code). This works for me because I've ensured that the code I'm loading does not use globals. Also, the global keyword is only a parser directive and can't refer to anything outside of the exec.
This really is way too much work to import the module without writing to disk, but if you ever want to do this, I believe it's the best way.
Here is how to import a string as a module (Python 2.x):
import sys,imp
my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec my_code in mymodule.__dict__
In Python 3, exec is a function, so this should work (up to Python 3.11; the imp module was removed in 3.12):
import sys, imp
my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec(my_code, mymodule.__dict__)
Now access the module attributes (and functions, classes, etc.) as:
>>> print(mymodule.a)
5
So that any later import of mymodule reuses this object instead of searching for a file, add the module to sys.modules:
sys.modules['mymodule'] = mymodule
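A quick way to see the effect, sketched with types.ModuleType so that it also runs where imp is unavailable. The module name string_module_demo is made up for the example.

```python
import sys
from types import ModuleType

my_code = "a = 5"

# Build the module in memory and register it under a made-up name.
demo = ModuleType("string_module_demo")
exec(my_code, demo.__dict__)
sys.modules["string_module_demo"] = demo

# The import system consults sys.modules first, so this statement finds the
# in-memory module and never searches the filesystem.
import string_module_demo as imported_again

print(imported_again is demo)
print(imported_again.a)
```

Without the sys.modules entry, the import statement would raise ModuleNotFoundError, since no file with that name exists.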
imp.new_module has been deprecated since Python 3.4, but the short solution from schlenk using types.ModuleType still works in Python 3.7.
imp.new_module was replaced with importlib.util.module_from_spec.
importlib.util.module_from_spec is preferred over using types.ModuleType to create a new module, as spec is used to set as many import-controlled attributes on the module as possible.
importlib.util.spec_from_loader uses available loader APIs, such as InspectLoader.is_package(), to fill in any missing information on the spec.
These module attributes are __builtins__, __doc__, __loader__, __name__, __package__, and __spec__.
import sys
import importlib.util  # the util submodule has to be imported explicitly
my_name = 'my_module'
my_spec = importlib.util.spec_from_loader(my_name, loader=None)
my_module = importlib.util.module_from_spec(my_spec)
my_code = '''
def f():
    print('f says hello')
'''
exec(my_code, my_module.__dict__)
sys.modules[my_name] = my_module
my_module.f()
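To illustrate the remark about loader APIs filling in the spec, here is a sketch that passes a real loader instead of loader=None. StringLoader is a hypothetical class written for this example, not something importlib provides; it relies on importlib.abc.InspectLoader to derive get_code() and exec_module() from get_source().

```python
import importlib.abc
import importlib.util
import sys


class StringLoader(importlib.abc.InspectLoader):
    """Hypothetical loader that serves a module's source from a string."""

    def __init__(self, source):
        self._source = source

    def get_source(self, fullname):
        # The only abstract method of InspectLoader.
        return self._source

    def is_package(self, fullname):
        # spec_from_loader() calls this to decide whether to treat the
        # module as a package.
        return False


source = "def f():\n    return 'f says hello'\n"
loader = StringLoader(source)
spec = importlib.util.spec_from_loader("my_string_module", loader)
module = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = module
loader.exec_module(module)  # compiles the source and runs it in the module

print(module.f())
print(module.__spec__ is spec, module.__loader__ is loader)
```

Compared with a bare exec, this leaves __spec__ and __loader__ pointing at objects that can actually produce the module's source again.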
You could simply create a Module object and stuff it into sys.modules and put your code inside.
Something like:
import sys
from types import ModuleType
mod = ModuleType('mymodule')
sys.modules['mymodule'] = mod
exec(mycode, mod.__dict__)  # mycode is the string holding the module's source
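Wrapped up as a function, the same idea might look like the sketch below. The name module_from_string is hypothetical, and removing the entry from sys.modules on failure is my own addition, modelled on how a normal import discards a module whose body raised.

```python
import sys
from types import ModuleType


def module_from_string(name, source):
    """Hypothetical helper: build a module from source text and register it."""
    mod = ModuleType(name)
    sys.modules[name] = mod
    try:
        exec(source, mod.__dict__)
    except BaseException:
        # Do not leave a half-initialised module behind.
        sys.modules.pop(name, None)
        raise
    return mod


demo_source = (
    "counter = 0\n"
    "def bump():\n"
    "    global counter\n"
    "    counter += 1\n"
    "    return counter\n"
)
mod = module_from_string("string_demo_mod", demo_source)
mod.bump()
mod.bump()
print(mod.counter)
```

Because the functions were defined by exec with mod.__dict__ as their globals, the global statement inside bump refers to the new module's namespace, not to the caller's.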
If the code for the module is in a string, you can forgo using StringIO and use it directly with exec, as illustrated below with a file named dynmodule.py.
Works in Python 2 & 3.
from __future__ import print_function
class _DynamicModule(object):
    def load(self, code):
        execdict = {'__builtins__': None}  # optional, to increase safety
        exec(code, execdict)
        keys = execdict.get(
            '__all__',  # use __all__ attribute if defined
            # else all non-private attributes
            (key for key in execdict if not key.startswith('_')))
        for key in keys:
            setattr(self, key, execdict[key])
# replace this module object in sys.modules with empty _DynamicModule instance
# see Stack Overflow question:
# https://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name
import sys as _sys
_ref, _sys.modules[__name__] = _sys.modules[__name__], _DynamicModule()
if __name__ == '__main__':
    import dynmodule  # name of this module
    import textwrap  # for more readable code formatting in sample string
    # string to be loaded can come from anywhere or be generated on-the-fly
    module_code = textwrap.dedent("""\
        foo, bar, baz = 5, 8, 2
        def func():
            return foo*bar + baz
        __all__ = 'foo', 'bar', 'func'  # 'baz' not included
        """)
    dynmodule.load(module_code)  # defines module's contents
    print('dynmodule.foo:', dynmodule.foo)
    try:
        print('dynmodule.baz:', dynmodule.baz)
    except AttributeError:
        print('no dynmodule.baz attribute was defined')
    else:
        print('Error: there should be no dynmodule.baz module attribute')
    print('dynmodule.func() returned:', dynmodule.func())
Output:
dynmodule.foo: 5
no dynmodule.baz attribute was defined
dynmodule.func() returned: 42
Setting the '__builtins__' entry to None in the execdict dictionary prevents the code from directly executing any built-in functions, like __import__, and so makes running it safer. You can ease that restriction by selectively adding things to it you feel are OK and/or required.
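Easing the restriction could be sketched as follows, with a dictionary of permitted built-ins in place of None. The whitelist contents and the run_restricted helper are assumptions made for this example. This only narrows which names the code can see and should not be read as a complete sandbox.

```python
# Hypothetical whitelist: only these built-ins are visible to the loaded code.
allowed_builtins = {"len": len, "sum": sum, "range": range}


def run_restricted(source):
    """Execute source with the whitelist and return the resulting namespace."""
    namespace = {"__builtins__": allowed_builtins}
    exec(source, namespace)
    return namespace


result = run_restricted("total = sum(range(4))\nsize = len('abc')\n")
print(result["total"], result["size"])

# Anything outside the whitelist fails, including the import statement,
# which needs __import__ from the built-ins.
for snippet in ("import os", "open('somefile')"):
    try:
        run_restricted(snippet)
    except (ImportError, NameError) as exc:
        print(snippet, "->", type(exc).__name__)
```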
It's also possible to add your own predefined utilities and attributes which you'd like made available to the code thereby creating a custom execution context for it to run in. That sort of thing can be useful for implementing a "plug-in" or other user-extensible architecture.
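A plug-in style execution context might be sketched like this. The register decorator, the PREFIX constant and the plug-in source are all invented for the illustration; the point is only that names placed in the dictionary are available to the loaded code without any import.

```python
registry = {}


def register(name):
    """Hypothetical decorator handed to plug-in code for registering commands."""
    def decorator(func):
        registry[name] = func
        return func
    return decorator


plugin_source = '''
@register("greet")
def greet(who):
    return PREFIX + ", " + who
'''

# The execution context: names the plug-in may use without importing them.
context = {"register": register, "PREFIX": "hello"}
exec(plugin_source, context)

print(registry["greet"]("world"))
```

The host keeps control over what the plug-in can reach, and the plug-in announces what it offers through the injected register function.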