memory error in python

Traceback (most recent call last):
File "/run-1341144766-1067082874/solution.py", line 27, in 
main()
File "/run-1341144766-1067082874/solution.py", line 11, in main
if len(s[i:j+1]) > 0:
MemoryError
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 64, in apport_excepthook
from apport.fileutils import likely_packaged, get_recent_crashes
File "/usr/lib/python2.7/dist-packages/apport/__init__.py", line 1, in 
from apport.report import Report
MemoryError

Original exception was:
Traceback (most recent call last):
File "/run-1341144766-1067082874/solution.py", line 27, in 
main()
File "/run-1341144766-1067082874/solution.py", line 11, in main
if len(s[i:j+1]) > 0:
MemoryError

The above errors appeared when I tried to run the following program. Can someone explain what is a memory error, and how to overcome this problem? . The program takes strings as input and finds all possible sub strings and creates a set(in a lexicographical order) out of it and it should print the value at the respective index asked by the user otherwise it should print 'Invalid'

def main():
    no_str = int(raw_input())
    sub_strings= []
    for k in xrange(0,no_str):
        s = raw_input()
        a=len(s)
        for i in xrange(0, a):
            for j in xrange(0, a):
                if j >= i:
                    if len(s[i:j+1]) > 0:
                        sub_strings.append(s[i:j+1])
    sub_strings = list(set(sub_strings))
    sub_strings.sort()
    queries= int(raw_input())
    resul = []
    for i in xrange(0,queries):
        resul.append(int(raw_input()))
    for p in resul:
        try:
            print sub_strings[p-1]
        except IndexError:
            print 'INVALID'


if __name__ == "__main__":
   main()

Solution 1:

If you get an unexpected MemoryError and you think you should have plenty of RAM available, it might be because you are using a 32-bit python installation.

The easy solution, if you have a 64-bit operating system, is to switch to a 64-bit installation of python.

The issue is that 32-bit python only has access to ~4GB of RAM. This can shrink even further if your operating system is 32-bit, because of the operating system overhead.

You can learn more about why 32-bit operating systems are limited to ~4GB of RAM here: https://superuser.com/questions/372881/is-there-a-technical-reason-why-32-bit-windows-is-limited-to-4gb-of-ram

Solution 2:

This one here:

s = raw_input()
a=len(s)
for i in xrange(0, a):
    for j in xrange(0, a):
        if j >= i:
            if len(s[i:j+1]) > 0:
                sub_strings.append(s[i:j+1])

seems to be very inefficient and expensive for large strings.

Better do

for i in xrange(0, a):
    for j in xrange(i, a): # ensures that j >= i, no test required
        part = buffer(s, i, j+1-i) # don't duplicate data
        if len(part) > 0:
            sub_Strings.append(part)

A buffer object keeps a reference to the original string and start and length attributes. This way, no unnecessary duplication of data occurs.

A string of length l has l*l/2 sub strings of average length l/2, so the memory consumption would roughly be l*l*l/4. With a buffer, it is much smaller.

Note that buffer() only exists in 2.x. 3.x has memoryview(), which is utilized slightly different.

Even better would be to compute the indexes and cut out the substring on demand.