Inserting text string with hex into PostgreSQL as a bytea
I have a text file with several strings of hex in it:
013d7d16d7ad4fefb61bd95b765c8ceb
007687fc64b746569616414b78c81ef1
I would like to store these in the database as a bytea, instead of a varchar. That is, I would like the database to store 01 as the single byte 00000001, not characters '0' & '1'.
I can easily run this file through sed to format/escape it any way I need to.
This is what I have tried:
create table mytable (testcol BYTEA);
This works:
insert into mytable (testcol) values (E'\x7f\x7f');
However, as soon as I have a byte that goes above \x7f, I get this error:
insert into mytable (testcol) values (E'\x7f\x80');
ERROR: invalid byte sequence for encoding "UTF8": 0x80
Any ideas, or am I approaching things wrong?
You can convert a hex string to bytea using the decode
function (where "encoding" means encoding a binary value to some textual value). For example:
select decode('DEADBEEF', 'hex');
decode
------------------
\336\255\276\357
which is more understandable with 9.0's default output:
decode
------------
\xdeadbeef
The reason you can't just say E'\xDE\xAD\xBE\xEF'
is that this is intended to make a text value, not a bytea, so Postgresql will try to convert it from the client encoding to the database encoding. You could write the bytea escape format like that, but you need to double the backslashes: E'\\336\\255\\276\\357'::bytea
. I think you can see why the bytea format is being changed.... IMHO the decode()
function is a reasonable way of writing inputs, even though there is some overhead involved.
INSERT INTO mytable (testcol) VALUES (decode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'))
The Ruby Way
I recently needed to read/write binary data from/to Postgres, but via Ruby. Here's how I did it using the Pg library.
Although not strictly Postgres-specific, I thought I'd include this Ruby-centric answer for reference.
Postgres DB Setup
require 'pg'
DB = PG::Connection.new(host: 'localhost', dbname:'test')
DB.exec "CREATE TABLE mytable (testcol BYTEA)"
BINARY = 1
Insert Binary Data
sql = "INSERT INTO mytable (testcol) VALUES ($1)"
param = {value: binary_data, format: BINARY}
DB.exec_params(sql, [param]) {|res| res.cmd_tuples == 1 }
Select Binary Data
sql = "SELECT testcol FROM mytable LIMIT 1"
DB.exec_params(sql, [], BINARY) {|res| res.getvalue(0,0) }
Introduction
This is an updated answer that includes both how to insert but also how to query.
It is possible to convert the hex into a bytea value using the decode
function. This should be used for both querying and also inserting.
This can be used for both inserting but also querying.
Example SQL Fiddle
Querying Existing Data
SELECT * FROM mytable WHERE testcol = (decode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));
Encode vs Decode for Querying
A user had asked the following:
How does searching the bytea field by hex value after inserting it?
SELECT * FROM my_table WHERE myHexField = (encode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));
does not work.
In the documentation Binary String Functions and Operators, they have the description of both encode
and decode
.
+==================================+=============+=======================================================================================================+=======================================+============+
| Function | Return Type | Description | Example | Result |
+==================================+=============+=======================================================================================================+=======================================+============+
| decode(string text, format text) | bytea | Decode binary data from textual representation in string. Options for format are same as in encode. | decode('123\000456', 'escape') | 123\000456 |
+----------------------------------+-------------+-------------------------------------------------------------------------------------------------------+---------------------------------------+------------+
| encode(data bytea, format text) | text | Encode binary data into a textual representation. Supported formats are: base64, hex, escape. escape | encode('123\000456'::bytea, 'escape') | 123\000456 |
| | | converts zero bytes and high-bit-set bytes to octal sequences (\nnn) and doubles backslashes. | | |
+----------------------------------+-------------+-------------------------------------------------------------------------------------------------------+---------------------------------------+------------+
So you will notice that Encode
is for encoding binary data into a textual string
and returns text. However, since we are storing bytea
we have to use decode
for both inserting and querying.
Inserting
create table mytable (testcol BYTEA);
INSERT INTO
mytable (testcol)
VALUES
(decode('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));
From: see previous answer