Python: create a pandas data frame from a list
I am using the following code to create a data frame from a list:
import pandas as pd
test_list = ['a','b','c','d']
df_test = pd.DataFrame.from_records(test_list, columns=['my_letters'])
df_test
The above code works fine. Then I tried the same approach for another list:
import pandas as pd
q_list = ['112354401', '116115526', '114909312', '122425491', '131957025', '111373473']
df1 = pd.DataFrame.from_records(q_list, columns=['q_data'])
df1
But this time it raised the following error:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-24-99e7b8e32a52> in <module>()
1 import pandas as pd
2 q_list = ['112354401', '116115526', '114909312', '122425491', '131957025', '111373473']
----> 3 df1 = pd.DataFrame.from_records(q_list, columns=['q_data'])
4 df1
/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in from_records(cls, data, index, exclude, columns, coerce_float, nrows)
1021 else:
1022 arrays, arr_columns = _to_arrays(data, columns,
-> 1023 coerce_float=coerce_float)
1024
1025 arr_columns = _ensure_index(arr_columns)
/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in _to_arrays(data, columns, coerce_float, dtype)
5550 data = lmap(tuple, data)
5551 return _list_to_arrays(data, columns, coerce_float=coerce_float,
-> 5552 dtype=dtype)
5553
5554
/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in _list_to_arrays(data, columns, coerce_float, dtype)
5607 content = list(lib.to_object_array(data).T)
5608 return _convert_object_array(content, columns, dtype=dtype,
-> 5609 coerce_float=coerce_float)
5610
5611
/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in _convert_object_array(content, columns, coerce_float, dtype)
5666 # caller's responsibility to check for this...
5667 raise AssertionError('%d columns passed, passed data had %s '
-> 5668 'columns' % (len(columns), len(content)))
5669
5670 # provide soft conversion of object dtypes
AssertionError: 1 columns passed, passed data had 9 columns
Why would the same approach work for one list but not another? Any idea what might be wrong here? Thanks a lot!
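DataFrame.from_records treats each string as a sequence of characters, so it needs as many columns as the string has characters. Your first list worked only because every string in it is a single character.
You could simply use the DataFrame constructor instead: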
In [3]: pd.DataFrame(q_list, columns=['q_data'])
Out[3]:
      q_data
0  112354401
1  116115526
2  114909312
3  122425491
4  131957025
5  111373473
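To make the difference between the two lists concrete, here is a minimal sketch that mirrors what the traceback shows (each record is converted with tuple before the columns are checked). The names widths_test and widths_q are only for illustration and are not part of pandas.

```python
import pandas as pd

test_list = ['a', 'b', 'c', 'd']
q_list = ['112354401', '116115526', '114909312',
          '122425491', '131957025', '111373473']

# A string turned into a tuple yields one element per character,
# so each record is as wide as the string is long.
widths_test = {len(tuple(s)) for s in test_list}  # one character per string
widths_q = {len(tuple(s)) for s in q_list}        # nine characters per string

print(widths_test, widths_q)

# The plain constructor treats a flat list as a single column of values.
df1 = pd.DataFrame(q_list, columns=['q_data'])
print(df1)
```

With one column name supplied, only the list of single-character strings has records of matching width, which is why the first call succeeded and the second did not.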
In[20]: test_list = [['a','b','c'], ['AA','BB','CC']]
In[21]: pd.DataFrame(test_list, columns=['col_A', 'col_B', 'col_C'])
Out[21]:
  col_A col_B col_C
0     a     b     c
1    AA    BB    CC
In[22]: pd.DataFrame(test_list, index=['col_low', 'col_up']).T
Out[22]:
  col_low col_up
0       a     AA
1       b     BB
2       c     CC
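In the session above, each inner list is a row by default, and the transpose turns the inner lists into columns. As a possible alternative (a sketch, not the only way), the same column-wise layout can be built without transposing by pairing each column name with its list in a dict. The variable names by_transpose and by_dict are made up for this example.

```python
import pandas as pd

test_list = [['a', 'b', 'c'], ['AA', 'BB', 'CC']]
names = ['col_low', 'col_up']

# Approach from the session above: label the rows, then transpose.
by_transpose = pd.DataFrame(test_list, index=names).T

# Alternative: map each name to its list, so every list becomes a column.
by_dict = pd.DataFrame(dict(zip(names, test_list)))

print(by_dict)
```

Both frames should hold the same values; the dict form just states the column orientation directly.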
If you want to create a DataFrame from multiple lists, you can zip them together. In Python 3, zip returns a lazy 'zip' object, so convert it to a list before passing it to the constructor:
mydf = pd.DataFrame(list(zip(lstA, lstB)), columns = ['My List A', 'My List B'])
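Here is a runnable version of the line above. The contents of lstA and lstB are invented sample values, since the answer does not define them. A dict of lists is shown next to it as another option.

```python
import pandas as pd

# Hypothetical sample data; the original answer does not say what the lists hold.
lstA = [1, 2, 3]
lstB = ['x', 'y', 'z']

# zip pairs the lists element by element; each pair becomes one row.
mydf = pd.DataFrame(list(zip(lstA, lstB)), columns=['My List A', 'My List B'])

# Alternative: a dict maps each column name to the list that fills it.
mydf_from_dict = pd.DataFrame({'My List A': lstA, 'My List B': lstB})

print(mydf)
```

One difference to keep in mind: zip silently stops at the shortest list, while the dict form raises an error if the lists have different lengths.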