Get a list of field values from Python's sqlite3, not tuples representing rows
It's annoying how Python's sqlite3 module always returns a list of tuples! When I am querying a single column, I would prefer to get a plain list.
e.g. when I execute
SELECT somecol FROM sometable
and call
cursor.fetchall()
it returns
[(u'one',), (u'two',), (u'three',)]
but I'd rather just get
[u'one', u'two', u'three']
Is there a way to do this?
sqlite3.Connection has a row_factory attribute.
The documentation states that:
You can change this attribute to a callable that accepts the cursor and the original row as a tuple and will return the real result row. This way, you can implement more advanced ways of returning results, such as returning an object that can also access columns by name.
To return a list of single values from a SELECT, such as an id, you can assign a lambda to row_factory that returns the first value (index 0) of each row, e.g.:
import sqlite3 as db
conn = db.connect('my.db')
conn.row_factory = lambda cursor, row: row[0]
c = conn.cursor()
ids = c.execute('SELECT id FROM users').fetchall()
This yields something like:
[1, 2, 3, 4, 5, 6] # etc.
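To try this without an existing my.db, here is a minimal self-contained sketch. It assumes an in-memory database and a made-up users table, neither of which comes from the question:

```python
import sqlite3

# Assumption: an in-memory database with a hypothetical `users` table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO users (name) VALUES (?)',
                 [('ann',), ('bob',), ('cy',)])

# Cursors created after this point return the first column of each row
# instead of a one-element tuple.
conn.row_factory = lambda cursor, row: row[0]

ids = conn.execute('SELECT id FROM users ORDER BY id').fetchall()
print(ids)  # [1, 2, 3]
conn.close()
```

Note that the factory only makes sense for single-column queries; with more columns it silently drops everything but the first.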
You can also set the row_factory directly on the cursor object itself. Indeed, if you do not set the row_factory on the connection before you create the cursor, you must set the row_factory on the cursor:
c = conn.cursor()
c.row_factory = lambda cursor, row: {'foo': row[0]}
You may redefine the row_factory at any point during the lifetime of the cursor object, and you can unset the row factory with None to return default tuple-based results:
c.row_factory = None
c.execute('SELECT id FROM users').fetchall() # [(1,), (2,), (3,)] etc.
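The cursor-level behaviour can be checked with a short sketch like the one below. The in-memory database and the users table are assumptions made for illustration only:

```python
import sqlite3

# Assumption: an in-memory database with a hypothetical `users` table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY)')
conn.executemany('INSERT INTO users (id) VALUES (?)', [(1,), (2,), (3,)])

c = conn.cursor()

# Factory set on this cursor only: rows come back as bare values.
c.row_factory = lambda cursor, row: row[0]
flat = c.execute('SELECT id FROM users ORDER BY id').fetchall()

# Resetting to None restores the default tuple rows.
c.row_factory = None
tuples = c.execute('SELECT id FROM users ORDER BY id').fetchall()

print(flat)    # [1, 2, 3]
print(tuples)  # [(1,), (2,), (3,)]
conn.close()
```

Setting the factory on one cursor leaves the connection and any other cursors untouched, which is handy when only a single query needs flat results.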
data=cursor.fetchall()
COLUMN = 0
column=[elt[COLUMN] for elt in data]
(My previous suggestion, column=zip(*data)[COLUMN], raises an IndexError if data is empty. In contrast, the list comprehension above just creates an empty list. Depending on your situation, raising an IndexError may be preferable, but I'll leave that to you to decide.)
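A quick sketch of the difference between the two approaches, using a hard-coded list as a stand-in for cursor.fetchall(). One caveat: on Python 3, zip returns an iterator that cannot be indexed, so the zip variant is wrapped in list() here, which is my adaptation rather than the original one-liner:

```python
# Stand-in for cursor.fetchall(); the values are made up for illustration.
data = [('one',), ('two',), ('three',)]
COLUMN = 0

# List comprehension: works for any number of rows, including zero.
column = [elt[COLUMN] for elt in data]

# zip-based variant (Python 3 needs list() before indexing); yields a tuple.
via_zip = list(zip(*data))[COLUMN]

# With no rows, the comprehension gives [] while the zip variant raises.
empty = []
empty_column = [elt[COLUMN] for elt in empty]
try:
    list(zip(*empty))[COLUMN]
    raised = False
except IndexError:
    raised = True

print(column)        # ['one', 'two', 'three']
print(via_zip)       # ('one', 'two', 'three')
print(empty_column)  # []
print(raised)        # True
```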
You don't really want to do this - anything you do along the lines of using zip or a list comprehension is just eating CPU cycles and sucking memory without adding significant value. You are far better served just dealing with the tuples.
As for why it returns tuples, it's because that is what the Python DB-API 2.0 (PEP 249) specifies for fetchall: a sequence of sequences, such as a list of tuples.
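If a flat list is still wanted and the extra copy is the concern, one option is to iterate over the cursor itself rather than calling fetchall() first, since sqlite3 cursors are iterable. A minimal sketch, assuming an in-memory table whose contents are made up to mirror the question:

```python
import sqlite3

# Assumption: an in-memory database with a hypothetical `sometable`.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sometable (somecol TEXT)')
conn.executemany('INSERT INTO sometable VALUES (?)',
                 [('one',), ('two',), ('three',)])

cursor = conn.execute('SELECT somecol FROM sometable ORDER BY rowid')

# Rows are pulled from the cursor one at a time, so no intermediate
# list of tuples is built before the flat list.
values = [row[0] for row in cursor]
print(values)  # ['one', 'two', 'three']
conn.close()
```

Whether this matters in practice depends on the result size; for small result sets the difference is unlikely to be noticeable.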
I use the pandas module to deal with table-like content:
import pandas as pd
df = pd.DataFrame(cursor.fetchall(), columns=['one','two'])
The values for column 'one' are then simply referred to as follows (this gives a NumPy array; call .tolist() on it for a plain list):
df['one'].values
You can even use your own index to reference the data:
df0 = pd.DataFrame.from_records(cursor.fetchall(), columns=['Time','Serie1','Serie2'],index='Time')