Keep persistent variables in memory between runs of Python script

You can achieve something like this using the reload global function to re-execute your main script's code. You will need to write a wrapper script that imports your main script, asks it for the variable it wants to cache, caches a copy of that within the wrapper script's module scope, and then when you want (when you hit ENTER on stdin or whatever), it calls reload(yourscriptmodule) but this time passes it the cached object such that yourscript can bypass the expensive computation. Here's a quick example.

wrapper.py

import sys
import mainscript

part1Cache = None
if __name__ == "__main__":
    while True:
        if not part1Cache:
            part1Cache = mainscript.part1()
        mainscript.part2(part1Cache)
        print "Press enter to re-run the script, CTRL-C to exit"
        sys.stdin.readline()
        reload(mainscript)

mainscript.py

def part1():
    print "part1 expensive computation running"
    return "This was expensive to compute"

def part2(value):
    print "part2 running with %s" % value

While wrapper.py is running, you can edit mainscript.py, add new code to the part2 function and be able to run your new code against the pre-computed part1Cache.

To keep data in memory, the process must keep running. Memory belongs to the process running the script, NOT to the shell. The shell cannot hold memory for you.

So if you want to change your code and keep your process running, you'll have to reload the modules when they're changed. If any of the data in memory is an instance of a class that changes, you'll have to find a way to convert it to an instance of the new class. It's a bit of a mess. Not many languages were ever any good at this kind of hot patching (Common Lisp comes to mind), and there are a lot of chances for things to go wrong.

If you only want to persist one object (or object graph) for future sessions, the shelve module probably is overkill. Just pickle the object you care about. Do the work and save the pickle if you have no pickle-file, or load the pickle-file if you have one.

import os
import cPickle as pickle

pickle_filepath = "/path/to/picklefile.pickle"

if not os.path.exists(pickle_filepath):
    # Read data set from disk
    with open('mydata', 'r') as in_handle:
        mytext = in_handle.read()
    # Extract relevant results from data set
    mydata = parse_data(mytext)
    result = initial_operations(mydata)
    with open(pickle_filepath, 'w') as pickle_handle:
        pickle.dump(result, pickle_handle)
else:
    with open(pickle_filepath) as pickle_handle:
        result = pickle.load(pickle_handle)

Python's shelve is a persistence solution for pickled (serialized) objects and is file-based. The advantage is that it stores Python objects directly, meaning the API is pretty simple.

If you really want to avoid the disk, the technology you are looking for is a "in-memory database." Several alternatives exist, see this SO question: in-memory database in Python.

Keep persistent variables in memory between runs of Python script

Related

Recent Posts