Python adds "E" to string

This string:

"CREATE USER %s PASSWORD %s", (user, pw)

always gets expanded to:

CREATE USER E'someuser' PASSWORD E'somepassword'

Can anyone tell me why?

Edit: The expanded string above is the string my database gives me back in the error message. I'm using psycopg2 to access my postgres database. The real code looks like this:

import psycopg2

conn = psycopg2.connect(user=adminuser, password=adminpass, host=host)
cur = conn.cursor()

# user and pw are plain Python strings that the function receives as parameters
cur.execute("CREATE USER %s PASSWORD %s", (user, pw))
conn.commit()

Solution 1:

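To pass identifiers to PostgreSQL through psycopg2, use AsIs from the extensions module. AsIs inserts the value into the query verbatim, without quoting or escaping, so use it only with values you trust or have validated.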

from psycopg2.extensions import AsIs
import psycopg2
connection = psycopg2.connect(database='db', user='user')
cur = connection.cursor()
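cur.mogrify(
    'CREATE USER %s PASSWORD %s', (AsIs('someuser'), 'somepassword')
    )
# The user name is an identifier, so AsIs inserts it unquoted.
# PASSWORD expects a string literal, so the password stays a normal parameter
# and psycopg2 quotes it (possibly with an E prefix, depending on the version).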

This also works for passing expressions to clauses such as ORDER BY:

cur.mogrify(
    'select * from t order by %s', (AsIs('some_column, another_column desc'),)
    )
'select * from t order by some_column, another_column desc'

Solution 2:

As the OP's edit reveals he's using PostgreSQL, the docs for it are relevant, and they say:

PostgreSQL also accepts "escape" string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g. E'foo'.

In other words, psycopg is correctly generating escape string constants for your strings, so that, as the docs also say:

Within an escape string, a backslash character (\) begins a C-like backslash escape sequence, in which the combination of backslash and following character(s) represents a special byte value.

(which as it happens are also the escape conventions of non-raw Python string literals).
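
To illustrate the parallel with Python literals, here is a small sketch in plain Python (the variable names are made up for the example). It only shows how Python itself treats backslashes; it does not touch the database:

```python
# In a non-raw literal, backslash + "n" is read as one newline character.
escaped = "a\nb"

# In a raw literal, the backslash and the "n" stay two separate characters.
raw = r"a\nb"

print(len(escaped))  # 3
print(len(raw))      # 4
```

An E'...' constant in PostgreSQL is the counterpart of the first form: backslash sequences inside it are interpreted rather than taken literally.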

The OP's error clearly has nothing to do with that, and, besides the excellent idea of studying PostgreSQL's excellent docs, he should not worry about that E'...' form in this case;-).

Solution 3:

Not only the E but the quotes appear to come from whatever type user and pw have. %s simply does what str() does, which may fall back to repr(), both of which have corresponding methods __str__ and __repr__. Also, that isn't the code that generates your result (I'd assumed there was a %, but now see only a comma). Please expand your question with actual code, types and values.

Addendum: Considering that it looks like SQL, I'd hazard a guess that you're seeing escape string constants, likely properly generated by your database interface module or library.
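
The str()/repr() behaviour mentioned above can be checked in plain Python. This is a minimal sketch; the two classes are hypothetical and exist only to show which method %s ends up calling:

```python
class OnlyRepr:
    """Hypothetical class that defines __repr__ but not __str__."""
    def __repr__(self):
        return "<repr of OnlyRepr>"


class StrAndRepr:
    """Hypothetical class that defines both methods."""
    def __str__(self):
        return "plain text"

    def __repr__(self):
        return "<repr of StrAndRepr>"


# %s calls str(); without __str__, str() falls back to __repr__.
from_repr = "%s" % OnlyRepr()
# With __str__ defined, %s uses it.
from_str = "%s" % StrAndRepr()
# %r always uses repr().
forced_repr = "%r" % StrAndRepr()

# For an ordinary Python string, %s adds no quotes at all; only %r does.
plain = "%s" % "someuser"
quoted = "%r" % "someuser"

print(from_repr, from_str, forced_repr, plain, quoted)
```

Since an ordinary string formatted with %s gains neither quotes nor an E, the E'...' in the question cannot come from Python's own % operator, which fits the guess that the database module generates it.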

Solution 4:

Before attempting something like:

statement = "CREATE USER %s PASSWORD %s" % (user, pw)

Please ensure you read: http://www.initd.org/psycopg/docs/usage.html

Basically the issue is that if you are accepting user input (I assume so as someone is entering in the user & pw) you are likely leaving yourself open to SQL injection.

As the psycopg2 documentation states:

Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.

As has been identified, Postgres (or Psycopg2) doesn't seem to provide a good answer to escaping identifiers. In my opinion, the best way to resolve this is to provide a 'whitelist' filtering method.

That is, identify which characters are allowed in a 'user' and a 'pw' (perhaps A-Za-z0-9_). Be careful not to allow characters that have a special meaning in SQL (', ;, etc.), or, if you do, make sure you escape them.
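
A whitelist check along these lines could look like the sketch below. The function name validate_name, the rule that a name must not start with a digit, and the 63-character limit are assumptions made for this example, not something psycopg2 provides; adjust the pattern to whatever your application actually allows:

```python
import re

# Assumed whitelist: a letter or underscore first, then letters, digits or underscores.
_ALLOWED_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_name(name, max_length=63):
    """Return name unchanged if it passes the whitelist, else raise ValueError.

    max_length is an assumed limit for this sketch.
    """
    if not isinstance(name, str):
        raise ValueError("name must be a string")
    if len(name) > max_length or not _ALLOWED_NAME.fullmatch(name):
        raise ValueError("invalid name: %r" % (name,))
    return name


print(validate_name("some_user1"))  # some_user1
```

Rejecting bad input outright, instead of trying to repair it, keeps the check easy to reason about. A name that passes could then be wrapped in AsIs as in Solution 1, while the password is still passed as a normal query parameter.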