How to remove carriage returns and new lines in Postgresql?
All,
I am stuck again trying to get my data in a format that I need it in. I have a text field that looks like this.
"deangelo 001 deangelo
local origin of name: italain
from the american name deangelo
meaning: of the angels
emotional spectrum • he is a fountain of joy for all.
personal integrity • his good name is his most precious asset. personality • it’s hard to soar with eagles when you’re surrounded by turkeys! relationships • starts slowly, but a relationship with deangelo builds over time. travel & leisure • a trip of a lifetime is in his future.
career & money • a gifted child, deangelo will need to be challenged constantly.
life’s opportunities • joy and happiness await this blessed person.
deangelo’s lucky numbers: 12 • 38 • 18 • 34 • 29 • 16
"
What would the best way be in Postgresql to remove the carriage returns and new lines? I've tried several things and none of them want to behave.
select regexp_replace(field, E'\r\c', ' ', 'g') from mytable
WHERE id = 5520805582
SELECT regexp_replace(field, E'[^\(\)\&\/,;\*\:.\>\<[:space:]a-zA-Z0-9-]', ' ')
FROM mytable
WHERE field~ E'[^\(\)\&\/,;\*\:.\<\>[:space:]a-zA-Z0-9-]'
AND id = 5520805582;
Thanks in advance, Adam
select regexp_replace(field, E'[\\n\\r]+', ' ', 'g' )
read the manual http://www.postgresql.org/docs/current/static/functions-matching.html
select regexp_replace(field, E'[\\n\\r\\u2028]+', ' ', 'g' )
I had the same problem in my postgres d/b, but the newline in question wasn't the traditional ascii CRLF, it was a unicode line separator, character U2028. The above code snippet will capture that unicode variation as well.
Update... although I've only ever encountered the aforementioned characters "in the wild", to follow lmichelbacher's advice to translate even more unicode newline-like characters, use this:
select regexp_replace(field, E'[\\n\\r\\f\\u000B\\u0085\\u2028\\u2029]+', ' ', 'g' )
OP asked specifically about regexes since it would appear there's concern for a number of other characters as well as newlines, but for those just wanting strip out newlines, you don't even need to go to a regex. You can simply do:
select replace(field,E'\n','');
I think this is an SQL-standard behavior, so it should extend back to all but perhaps the very earliest versions of Postgres. The above tested fine for me in 9.4 and 9.2