Variable's memory size in Python [duplicate]
I am writing Python code to do some big number calculation, and have serious concern about the memory used in the calculation.
Thus, I want to count every bit of each variable.
For example, I have a variable x, which is a big number, and want to count the number of bits for representing x.
The following code is obviously useless:
x=2**1000
len(x)
Thus, I turn to use the following code:
x=2**1000
len(repr(x))
The variable x is (in decimal) is:
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376
but the above code returns 303
The above long long sequence is of length 302, and so I believe that 303 should be related to the string length only.
So, here comes my original question:
How can I know the memory size of variable x?
One more thing; in C/C++ language, if I define
int z=1;
This means that there are 4 bytes= 32 bits allocated for z, and the bits are arranged as 00..001(31 0's and one 1).
Here, my variable x is huge, I don't know whether it follows the same memory allocation rule?
Solution 1:
Use sys.getsizeof
to get the size of an object, in bytes.
>>> from sys import getsizeof
>>> a = 42
>>> getsizeof(a)
12
>>> a = 2**1000
>>> getsizeof(a)
146
>>>
Note that the size and layout of an object is purely implementation-specific. CPython, for example, may use totally different internal data structures than IronPython. So the size of an object may vary from implementation to implementation.
Solution 2:
Regarding the internal structure of a Python long, check sys.int_info (or sys.long_info for Python 2.7).
>>> import sys
>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4)
Python either stores 30 bits into 4 bytes (most 64-bit systems) or 15 bits into 2 bytes (most 32-bit systems). Comparing the actual memory usage with calculated values, I get
>>> import math, sys
>>> a=0
>>> sys.getsizeof(a)
24
>>> a=2**100
>>> sys.getsizeof(a)
40
>>> a=2**1000
>>> sys.getsizeof(a)
160
>>> 24+4*math.ceil(100/30)
40
>>> 24+4*math.ceil(1000/30)
160
There are 24 bytes of overhead for 0 since no bits are stored. The memory requirements for larger values matches the calculated values.
If your numbers are so large that you are concerned about the 6.25% unused bits, you should probably look at the gmpy2 library. The internal representation uses all available bits and computations are significantly faster for large values (say, greater than 100 digits).