How to load a module from code in a string?

I have some code in the form of a string and would like to make a module out of it without writing to disk.

When I try using imp and a StringIO object to do this, I get:

>>> imp.load_source('my_module', '', StringIO('print "hello world"'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: load_source() argument 3 must be file, not instance
>>> imp.load_module('my_module', StringIO('print "hello world"'), '', ('', '', 0))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: load_module arg#2 should be a file or None

How can I create the module without having an actual file? Alternatively, how can I wrap a StringIO in a file without writing to disk?

UPDATE:

NOTE: The same problem occurs in Python 3.

The code I'm trying to load is only partially trusted. I've gone through it with ast and determined that it doesn't import anything or do anything I don't like, but I don't trust it enough to run it when I have local variables running around that could get modified, and I don't trust my own code to stay out of the way of the code I'm trying to import.

I created an empty module that only contains the following:

def load(code):
    # Delete all local variables
    globals()['code'] = code
    del locals()['code']

    # Run the code
    exec(globals()['code'])

    # Delete any global variables we've added
    del globals()['load']
    del globals()['code']

    # Copy k so we can use it
    if 'k' in locals():
        globals()['k'] = locals()['k']
        del locals()['k']

    # Copy the rest of the variables
    for k in locals().keys():
        globals()[k] = locals()[k]

Then you can import mymodule and call mymodule.load(code). This works for me because I've ensured that the code I'm loading does not use globals. Also, the global keyword is only a parser directive and can't refer to anything outside of the exec.

This really is way too much work to import the module without writing to disk, but if you ever want to do this, I believe it's the best way.

Here is how to import a string as a module (Python 2.x):

import sys, imp

my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec my_code in mymodule.__dict__

In Python 3, exec is a function, so this should work (on versions before 3.12, which removed the deprecated imp module):

import sys, imp

my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec(my_code, mymodule.__dict__)

Now you can access the module's attributes (functions, classes, etc.):

>>> print(mymodule.a)
5

So that any later import of mymodule reuses this object instead of searching for a file, add the module to sys.modules:

sys.modules['mymodule'] = mymodule
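
To illustrate what that registration does, here is a minimal sketch. It assumes Python 3 and uses types.ModuleType as a stand-in for imp.new_module, so that it also runs on versions without imp. The module name cached_demo is hypothetical.

```python
import importlib
import sys
from types import ModuleType

my_code = 'a = 5'

# Hypothetical module name, chosen only for this illustration.
cached_demo = ModuleType('cached_demo')
exec(my_code, cached_demo.__dict__)

# Register the module so the import system finds it in its cache.
sys.modules['cached_demo'] = cached_demo

# Both forms return the object registered above; no file lookup happens.
import cached_demo as imported_again
looked_up = importlib.import_module('cached_demo')

print(imported_again is cached_demo)  # True
print(looked_up.a)                    # 5
```

Without the sys.modules entry, the import statement would search sys.path and fail with ModuleNotFoundError, since no file named cached_demo exists.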

imp.new_module has been deprecated since Python 3.4, and the imp module itself was removed in Python 3.12.

However, the short solution from schlenk using types.ModuleType still works in Python 3.7.

imp.new_module was replaced by importlib.util.module_from_spec.

importlib.util.module_from_spec is preferred over using types.ModuleType to create a new module, because the spec is used to set as many import-controlled attributes on the module as possible.

importlib.util.spec_from_loader uses available loader APIs, such as InspectLoader.is_package(), to fill in any missing information on the spec.

With loader=None, the resulting module has the attributes __doc__, __loader__, __name__, __package__ and __spec__; __builtins__ is added once code is exec'd in the module's namespace.

import sys
import importlib.util  # "import importlib" alone does not guarantee that importlib.util is loaded

my_name = 'my_module'
my_spec = importlib.util.spec_from_loader(my_name, loader=None)

my_module = importlib.util.module_from_spec(my_spec)

my_code = '''
def f():
    print('f says hello')
'''
exec(my_code, my_module.__dict__)
sys.modules['my_module'] = my_module

my_module.f()
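
As a rough comparison of the two approaches, the sketch below builds one module with module_from_spec and one with types.ModuleType, then inspects __spec__ on each. The names spec_demo and plain_demo are hypothetical, and the example assumes Python 3.5 or later.

```python
import importlib.util
from types import ModuleType

code = "def f():\n    return 'f says hello'\n"

# Module created from a spec (no loader, as in the answer above).
spec = importlib.util.spec_from_loader('spec_demo', loader=None)
from_spec = importlib.util.module_from_spec(spec)
exec(code, from_spec.__dict__)

# Module created directly from its type.
plain = ModuleType('plain_demo')
exec(code, plain.__dict__)

print(from_spec.__spec__.name)   # spec_demo
print(plain.__spec__)            # None
print(from_spec.f(), plain.f())
```

Both modules run the code identically; the difference is only in the import metadata attached to the module object.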

You could simply create a Module object and stuff it into sys.modules and put your code inside.

Something like:

import sys
from types import ModuleType

# mycode is the string that holds the module's source
mod = ModuleType('mymodule')
sys.modules['mymodule'] = mod
exec(mycode, mod.__dict__)
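
The same idea can be wrapped in a small helper. This is only a sketch: module_from_string, string_demo and the sample source are hypothetical names, and the cleanup on failure is an added assumption, not part of the answer above.

```python
import sys
from types import ModuleType


def module_from_string(name, source):
    """Create a module called `name` from `source` and register it."""
    mod = ModuleType(name)
    sys.modules[name] = mod
    try:
        exec(source, mod.__dict__)
    except BaseException:
        # Do not leave a half-initialised module in the cache.
        sys.modules.pop(name, None)
        raise
    return mod


# Hypothetical module name and source, for illustration only.
mod = module_from_string('string_demo', "def double(x):\n    return 2 * x\n")
import string_demo
print(string_demo.double(21))  # 42
```

Registering before exec means the code can import itself by name while it runs; removing the entry on failure keeps a broken module from being picked up by later imports.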

If the code for the module is in a string, you can forgo using StringIO and use it directly with exec, as illustrated below with a file named dynmodule.py. Works in Python 2 & 3.

from __future__ import print_function

class _DynamicModule(object):
    def load(self, code):
        execdict = {'__builtins__': None}  # optional, to increase safety
        exec(code, execdict)
        keys = execdict.get(
            '__all__',  # use __all__ attribute if defined
            # else all non-private attributes
            (key for key in execdict if not key.startswith('_')))
        for key in keys:
            setattr(self, key, execdict[key])

# replace this module object in sys.modules with empty _DynamicModule instance
# see Stack Overflow question:
# https://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name
import sys as _sys
_ref, _sys.modules[__name__] = _sys.modules[__name__], _DynamicModule()

if __name__ == '__main__':
    import dynmodule  # name of this module
    import textwrap  # for more readable code formatting in sample string

    # string to be loaded can come from anywhere or be generated on-the-fly
    module_code = textwrap.dedent("""\
        foo, bar, baz = 5, 8, 2

        def func():
            return foo*bar + baz

        __all__ = 'foo', 'bar', 'func'  # 'baz' not included
        """)

    dynmodule.load(module_code)  # defines module's contents

    print('dynmodule.foo:', dynmodule.foo)
    try:
        print('dynmodule.baz:', dynmodule.baz)
    except AttributeError:
        print('no dynmodule.baz attribute was defined')
    else:
        print('Error: there should be no dynmodule.baz module attribute')
    print('dynmodule.func() returned:', dynmodule.func())

Output:

dynmodule.foo: 5
no dynmodule.baz attribute was defined
dynmodule.func() returned: 42

Setting the '__builtins__' entry to None in the execdict dictionary prevents the code from directly calling any built-in functions, including __import__ (and, in Python 3, print), which makes running it safer. You can ease that restriction by replacing None with a dictionary that contains only the built-ins you consider acceptable or necessary.

It's also possible to add your own predefined utilities and attributes which you'd like made available to the code thereby creating a custom execution context for it to run in. That sort of thing can be useful for implementing a "plug-in" or other user-extensible architecture.
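
A minimal sketch of such a custom execution context follows, assuming Python 3. The function run_restricted, the whitelist of built-ins and the host helper scale are all hypothetical. Restricting names this way narrows what the code can reach, but it should not be treated as a complete sandbox.

```python
def run_restricted(code, extra_names=None):
    """Execute `code` with a small whitelist of built-ins plus extra names."""
    allowed_builtins = {'len': len, 'range': range, 'sum': sum}
    namespace = {'__builtins__': allowed_builtins}
    namespace.update(extra_names or {})
    exec(code, namespace)
    # Return only the public names that the code itself defined.
    return {k: v for k, v in namespace.items()
            if not k.startswith('_') and k not in (extra_names or {})}


# Hypothetical plug-in source that uses a host-provided helper, `scale`.
plugin_code = "total = scale(sum(range(4)))\n"
result = run_restricted(plugin_code, {'scale': lambda x: x * 10})
print(result['total'])  # 60

try:
    run_restricted("open('somefile')")
except NameError as exc:
    print('blocked:', exc)
```

Names left out of the whitelist, such as open, raise NameError inside the executed code, and import statements fail with ImportError because __import__ is not available.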
