Inserting text string with hex into PostgreSQL as a bytea

I have a text file with several strings of hex in it:

013d7d16d7ad4fefb61bd95b765c8ceb
007687fc64b746569616414b78c81ef1

I would like to store these in the database as a bytea, instead of a varchar. That is, I would like the database to store 01 as the single byte 00000001, not characters '0' & '1'.

I can easily run this file through sed to format/escape it any way I need to.

This is what I have tried:

create table mytable (testcol BYTEA);

This works:

insert into mytable (testcol) values (E'\x7f\x7f');

However, as soon as I have a byte that goes above \x7f, I get this error:

insert into mytable (testcol) values (E'\x7f\x80');
ERROR:  invalid byte sequence for encoding "UTF8": 0x80

Any ideas, or am I approaching things wrong?


You can convert a hex string to bytea using the decode function (in PostgreSQL's naming, "encoding" means turning a binary value into a textual representation, so decode goes from text back to binary). For example:

select decode('DEADBEEF', 'hex');
      decode      
------------------
 \336\255\276\357

which is easier to read with the hex output format that is the default since 9.0:

   decode   
------------
 \xdeadbeef

The reason you can't just say E'\xDE\xAD\xBE\xEF' is that this is intended to make a text value, not a bytea, so PostgreSQL will try to convert it from the client encoding to the database encoding, and a lone high byte such as 0x80 is not valid UTF-8. You could write the bytea escape format like that, but you need to double the backslashes: E'\\336\\255\\276\\357'::bytea. I think you can see why the bytea format is being changed... IMHO the decode() function is a reasonable way of writing inputs, even though there is some overhead involved.
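To make the encoding problem concrete, here is a small Ruby sketch (Ruby only because it appears elsewhere in this thread; nothing here talks to the database). It checks whether the bytes from the question form valid UTF-8 text, which is roughly the kind of check a UTF8 database applies to a text literal, and then builds the same bytes from hex digits, which is comparable to what decode(..., 'hex') does on the server. Treat it as an illustration under those assumptions, not as a description of PostgreSQL internals.

```ruby
# Bytes 0x7f 0x7f are plain ASCII, so they are also valid UTF-8 text.
ok_text = "\x7f\x7f".dup.force_encoding(Encoding::UTF_8)

# A lone 0x80 is a UTF-8 continuation byte with no lead byte before it,
# so this sequence is not valid UTF-8.
bad_text = "\x7f\x80".dup.force_encoding(Encoding::UTF_8)

puts ok_text.valid_encoding?   # true
puts bad_text.valid_encoding?  # false

# Treating the value as raw bytes (which is what bytea stores) avoids
# the problem: pack('H*') turns pairs of hex digits into bytes.
raw = ['7f80'].pack('H*')
puts raw.bytes.inspect         # [127, 128]
```

The same two bytes are fine as binary data and invalid as UTF-8 text, which is why the value has to reach the server as bytea rather than as a text literal.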


INSERT INTO
  mytable (testcol)
VALUES
  (decode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));
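Since the question starts from a text file with one hex string per line, a script can generate one such INSERT per line. The following is a minimal Ruby sketch; insert_statements and HEX_LINE are hypothetical names introduced here, and the table and column names are taken from the question. It only builds SQL text and does not connect to a database.

```ruby
# Hypothetical helper: turns lines of hex text into INSERT statements
# that use decode(..., 'hex'). Accepts only an even number of hex digits.
HEX_LINE = /\A(?:[0-9a-fA-F]{2})+\z/

def insert_statements(lines, table: 'mytable', column: 'testcol')
  lines.map(&:strip).reject(&:empty?).map do |hex|
    # Validate before interpolating, so nothing but hex digits can end
    # up inside the SQL string literal.
    raise ArgumentError, "not a hex string: #{hex.inspect}" unless hex.match?(HEX_LINE)
    "INSERT INTO #{table} (#{column}) VALUES (decode('#{hex}', 'hex'));"
  end
end

lines = [
  "013d7d16d7ad4fefb61bd95b765c8ceb\n",
  "007687fc64b746569616414b78c81ef1\n"
]
puts insert_statements(lines)
```

In practice the lines would come from something like File.readlines on your input file, and the output could be piped into psql. If you are already connected from application code, passing the hex string as a bound parameter to decode($1, 'hex') is usually preferable to building SQL strings.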

The Ruby Way

I recently needed to read/write binary data from/to Postgres, but via Ruby. Here's how I did it using the Pg library.

Although this is not purely a Postgres answer, I thought I'd include it as a Ruby-centric reference.

Postgres DB Setup

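require 'pg'
# Connect to a local database named "test"
DB = PG::Connection.new(host: 'localhost', dbname: 'test')
DB.exec "CREATE TABLE mytable (testcol BYTEA)"
# Format code 1 means binary (0 means text) for parameters and results
BINARY = 1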

Insert Binary Data

sql = "INSERT INTO mytable (testcol) VALUES ($1)"
param = {value: binary_data, format: BINARY}
DB.exec_params(sql, [param]) {|res| res.cmd_tuples == 1 }

Select Binary Data

sql = "SELECT testcol FROM mytable LIMIT 1"
DB.exec_params(sql, [], BINARY) {|res| res.getvalue(0,0) }
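The snippets above leave binary_data undefined. If the starting point is hex text, as in the question, one way to produce it is Ruby's pack and unpack1 with the 'H*' directive. This is a sketch: hex_to_bytes and bytes_to_hex are hypothetical helper names, not part of the pg gem.

```ruby
# Hypothetical helpers for converting between hex text and raw bytes.
def hex_to_bytes(hex)
  [hex].pack('H*')       # returns a binary (ASCII-8BIT) string
end

def bytes_to_hex(bytes)
  bytes.unpack1('H*')    # returns lowercase hex digits
end

binary_data = hex_to_bytes('013d7d16d7ad4fefb61bd95b765c8ceb')
puts binary_data.bytesize                      # 16
puts binary_data.encoding == Encoding::BINARY  # true
puts bytes_to_hex(binary_data)                 # 013d7d16d7ad4fefb61bd95b765c8ceb
```

The resulting binary_data can be used as the value in the binary-format parameter shown in the insert example, and bytes_to_hex can turn a value read back in binary format into hex for display.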

Introduction

This is an updated answer that covers both how to insert and how to query.

It is possible to convert the hex into a bytea value using the decode function. This should be used for both inserting and querying.


Querying Existing Data

SELECT * FROM mytable WHERE testcol = (decode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));

Encode vs Decode for Querying

A user had asked the following:

How do I search the bytea field by hex value after inserting it?

SELECT * FROM my_table WHERE myHexField =
(encode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));

does not work.

The documentation section "Binary String Functions and Operators" describes both encode and decode:

   +==================================+=============+=======================================================================================================+=======================================+============+
    |             Function             | Return Type |                                              Description                                              |                Example                |   Result   |
    +==================================+=============+=======================================================================================================+=======================================+============+
    | decode(string text, format text) | bytea       | Decode binary data from textual representation in string. Options for format are same as in encode.   | decode('123\000456', 'escape')        | 123\000456 |
    +----------------------------------+-------------+-------------------------------------------------------------------------------------------------------+---------------------------------------+------------+
    | encode(data bytea, format text)  | text        | Encode binary data into a textual representation. Supported formats are: base64, hex, escape. escape  | encode('123\000456'::bytea, 'escape') | 123\000456 |
    |                                  |             | converts zero bytes and high-bit-set bytes to octal sequences (\nnn) and doubles backslashes.         |                                       |            |
    +----------------------------------+-------------+-------------------------------------------------------------------------------------------------------+---------------------------------------+------------+

Note that encode turns binary data into a textual string and returns text. Since the column stores bytea, we have to use decode for both inserting and querying: it converts the hex text into a bytea value that can be compared with the column.
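As a rough illustration of why the encode version cannot match, the following Ruby sketch mimics the two directions with pack and unpack1. It is an analogy only, assuming the hex string is handed to encode as ordinary characters; it does not run anything in PostgreSQL.

```ruby
hex = '013d'

# decode-like direction: hex text -> the bytes it describes
decoded = [hex].pack('H*')
puts decoded.bytes.inspect   # [1, 61]

# encode-like direction applied to the same text: the characters
# '0', '1', '3', 'd' are themselves treated as the data and hex-encoded
encoded = hex.unpack1('H*')
puts encoded                 # 30313364
```

The encode direction yields a different value, and text rather than binary data, so it cannot equal the bytes stored in the bytea column.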

Inserting

create table mytable (testcol BYTEA);

INSERT INTO
  mytable (testcol)
VALUES
  (decode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));

From: see previous answer