Postgres: INSERT if does not exist already

I'm using Python to write to a postgres database:

sql_string = "INSERT INTO hundred (name,name_slug,status) VALUES ("
sql_string += hundred + ", '" + hundred_slug + "', " + status + ");"
cursor.execute(sql_string)

But because some of my rows are identical, I get the following error:

psycopg2.IntegrityError: duplicate key value  
  violates unique constraint "hundred_pkey"

How can I write an 'INSERT unless this row already exists' SQL statement?

I've seen complex statements like this recommended:

IF EXISTS (SELECT * FROM invoices WHERE invoiceid = '12345')
UPDATE invoices SET billed = 'TRUE' WHERE invoiceid = '12345'
ELSE
INSERT INTO invoices (invoiceid, billed) VALUES ('12345', 'TRUE')
END IF

But firstly, is this overkill for what I need, and secondly, how can I execute one of those as a simple string?


Postgres 9.5 (released since 2016-01-07) offers an "upsert" command, also known as an ON CONFLICT clause to INSERT:

INSERT ... ON CONFLICT DO NOTHING/UPDATE

It solves many of the subtle problems you can run into when using concurrent operation, which some other answers propose.


How can I write an 'INSERT unless this row already exists' SQL statement?

There is a nice way of doing conditional INSERT in PostgreSQL:

INSERT INTO example_table
    (id, name)
SELECT 1, 'John'
WHERE
    NOT EXISTS (
        SELECT id FROM example_table WHERE id = 1
    );

CAVEAT This approach is not 100% reliable for concurrent write operations, though. There is a very tiny race condition between the SELECT in the NOT EXISTS anti-semi-join and the INSERT itself. It can fail under such conditions.


One approach would be to create a non-constrained (no unique indexes) table to insert all your data into and do a select distinct from that to do your insert into your hundred table.

So high level would be. I assume all three columns are distinct in my example so for step3 change the NOT EXITS join to only join on the unique columns in the hundred table.

  1. Create temporary table. See docs here.

    CREATE TEMPORARY TABLE temp_data(name, name_slug, status);
    
  2. INSERT Data into temp table.

    INSERT INTO temp_data(name, name_slug, status); 
    
  3. Add any indexes to the temp table.

  4. Do main table insert.

    INSERT INTO hundred(name, name_slug, status) 
        SELECT DISTINCT name, name_slug, status
        FROM hundred
        WHERE NOT EXISTS (
            SELECT 'X' 
            FROM temp_data
            WHERE 
                temp_data.name          = hundred.name
                AND temp_data.name_slug = hundred.name_slug
                AND temp_data.status    = status
        );
    

Unfortunately, PostgreSQL supports neither MERGE nor ON DUPLICATE KEY UPDATE, so you'll have to do it in two statements:

UPDATE  invoices
SET     billed = 'TRUE'
WHERE   invoices = '12345'

INSERT
INTO    invoices (invoiceid, billed)
SELECT  '12345', 'TRUE'
WHERE   '12345' NOT IN
        (
        SELECT  invoiceid
        FROM    invoices
        )

You can wrap it into a function:

CREATE OR REPLACE FUNCTION fn_upd_invoices(id VARCHAR(32), billed VARCHAR(32))
RETURNS VOID
AS
$$
        UPDATE  invoices
        SET     billed = $2
        WHERE   invoices = $1;

        INSERT
        INTO    invoices (invoiceid, billed)
        SELECT  $1, $2
        WHERE   $1 NOT IN
                (
                SELECT  invoiceid
                FROM    invoices
                );
$$
LANGUAGE 'sql';

and just call it:

SELECT  fn_upd_invoices('12345', 'TRUE')