Return Pandas dataframe from PostgreSQL query with sqlalchemy
I want to query a PostgreSQL database and return the output as a Pandas dataframe.
I created a connection to the database with 'SqlAlchemy':
from sqlalchemy import create_engine
engine = create_engine('postgresql://user@localhost:5432/mydb')
I write a Pandas dataframe to a database table:
i=pd.read_csv(path)
i.to_sql('Stat_Table',engine,if_exists='replace')
Based on the docs, looks like pd.read_sql_query() should accept a SQLAlchemy engine:
a=pd.read_sql_query('select * from Stat_Table',con=engine)
But it throws an error:
ProgrammingError: (ProgrammingError) relation "stat_table" does not exist
I'm using Pandas version 0.14.1.
What's the right way to do this?
Solution 1:
You are bitten by the case (in)sensitivity issues with PostgreSQL. If you quote the table name in the query, it will work:
df = pd.read_sql_query('select * from "Stat_Table"',con=engine)
But personally, I would advise to just always use lower case table names (and column names), also when writing the table to the database to prevent such issues.
From the PostgreSQL docs (http://www.postgresql.org/docs/8.0/static/sql-syntax.html#SQL-SYNTAX-IDENTIFIERS):
Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case
To explain a bit more: you have written a table with the name Stat_Table
to the database (and sqlalchemy will quote this name, so it will be written as "Stat_Table" in the postgres database). When doing the query 'select * from Stat_Table'
the unquoted table name will be converted to lower case stat_table
, and so you get the message that this table is not found.
See eg also Are PostgreSQL column names case-sensitive?
Solution 2:
Read postgres sql data in pandas in given below and image link
import psycopg2 as pg
import pandas.io.sql as psql
connection = pg.connect("host=localhost dbname=kinder user=your_username password=your_password")
dataframe = psql.read_sql('SELECT * FROM product_product', connection)
product_category = psql.read_sql_query('select * from product_category', connection)
https://i.stack.imgur.com/1bege.png
Solution 3:
Late to the party here, but to give you a full example of this:
import pandas as pd
import psycopg2 as pg
engine = pg.connect("dbname='my_db_name' user='pguser' host='127.0.0.1' port='15432' password='pgpassword'")
df = pd.read_sql('select * from Stat_Table', con=engine)
You need to run the following to install the dependencies for ubuntu:
pip install pandas psycopg2-binary SQLAlchemy
Pandas docs on the subject here
Solution 4:
The error message is telling you that a table named:
stat_table
does not exist( a relation is a table in postgres speak). So, of course you can't select rows from it. Check your db after executing:
i.to_sql('Stat_Table',engine,if_exists='replace')
and see if a table by that name got created in your db.
When I use your read statement:
df = pd.read_sql_query('select * from Stat_Table',con=engine)
I get the data back from a postgres db, so there's nothing wrong with it.