Testing if a pandas DataFrame exists
Option 1 (my preferred option)
This is @Ami Tavory's approach; please select his answer if you like it.
It is very idiomatic Python to initialize a variable with None, then check for None prior to doing something with that variable.
df1 = None
if df1 is not None:
    print(df1.head())
Option 2
However, setting up an empty dataframe isn't at all a bad idea.
import pandas as pd

df1 = pd.DataFrame()
if not df1.empty:
    print(df1.head())
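Options 1 and 2 can also be combined when a variable may hold either None or an empty placeholder. The following is a minimal sketch, not part of either answer; `has_data` is a hypothetical helper name, not a pandas function. It also shows why an explicit check is needed: a bare truth test on a DataFrame raises ValueError.

```python
import pandas as pd


def has_data(df):
    """Hypothetical helper: True only for a non-None, non-empty DataFrame."""
    return df is not None and not df.empty


print(has_data(None))                         # False: nothing assigned yet
print(has_data(pd.DataFrame()))               # False: placeholder with no data
print(has_data(pd.DataFrame({"a": [1, 2]})))  # True: real data

# A bare truth test is not a substitute: pandas refuses to guess.
try:
    bool(pd.DataFrame())
except ValueError as err:
    print("ValueError:", err)
```

Note that `.empty` is True whenever any axis has length 0, so a frame with columns but no rows also counts as empty.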
Option 3
Just try it.
try:
    print(df1.head())
# catch when df1 is None
except AttributeError:
    pass
# catch when it hasn't even been defined
except NameError:
    pass
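To see which exception each case triggers, here is a small sketch that uses only the standard library. The `results` list is an illustrative name used here to record what was caught; no DataFrame is needed, because both failures happen before any pandas code runs.

```python
results = []

# Case 1: the variable exists but holds None, which has no .head() method.
df1 = None
try:
    df1.head()
except AttributeError:
    results.append("AttributeError")

# Case 2: the name is not defined at all.
del df1
try:
    df1.head()
except NameError:
    results.append("NameError")

print(results)  # ['AttributeError', 'NameError']
```

If both cases should be handled the same way, the two clauses can be merged into `except (AttributeError, NameError):`.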
Timing
When df1 is in its initialized state or doesn't exist at all
When df1 is a dataframe with something in it
import numpy as np
df1 = pd.DataFrame(np.arange(25).reshape(5, 5), list('ABCDE'), list('abcde'))
df1
In my code, I have several variables which can either contain a pandas DataFrame or nothing at all.
The Pythonic way of indicating "nothing" is via None, and for checking "not nothing" via
if df1 is not None:
    ...
I am not sure how critical time is here, but since you measured things:
In [82]: t = timeit.Timer('if x is not None: pass', setup='x=None')
In [83]: t.timeit()
Out[83]: 0.022536039352416992
In [84]: t = timeit.Timer('if isinstance(x, type(None)): pass', setup='x=None')
In [85]: t.timeit()
Out[85]: 0.11571192741394043
So checking that something is not None is also faster than the isinstance alternative.
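The comparison above can be rerun as a standalone script. This is a sketch using the same two statements as the session; the variable names and the repetition count `N` are choices made here, and the absolute numbers will vary by machine and Python version.

```python
import timeit

N = 1000000  # repetitions per measurement (matches timeit's default)

identity_time = timeit.timeit(
    "if x is not None: pass", setup="x = None", number=N
)
isinstance_time = timeit.timeit(
    "if isinstance(x, type(None)): pass", setup="x = None", number=N
)

print("is not None: %.4f s" % identity_time)
print("isinstance:  %.4f s" % isinstance_time)
```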