Counting the number of occurrences of a substring within a string in PostgreSQL
How can I count the number of occurrences of a substring within a string in PostgreSQL?
Example:
I have a table
CREATE TABLE test."user"
(
uid integer NOT NULL,
name text,
result integer,
CONSTRAINT pkey PRIMARY KEY (uid)
)
I want to write a query so that the result
contains column how many occurrences of the substring o
the column name
contains. For instance, if in one row, name
is hello world
, the column result
should contain 2
, since there are two o
in the string hello world
.
In other words, I'm trying to write a query that would take as input:
and update the result
column:
I am aware of the function regexp_matches
and its g
option, which indicates that the full (g
= global) string needs to be scanned for the presence of all occurrences of the substring).
Example:
SELECT * FROM regexp_matches('hello world', 'o', 'g');
returns
{o}
{o}
and
SELECT COUNT(*) FROM regexp_matches('hello world', 'o', 'g');
returns
2
But I don't see how to write an UPDATE
query that would update the result
column in such a way that it would contain how many occurrences of the substring o the column name
contains.
Solution 1:
A common solution is based on this logic: replace the search string with an empty string and divide the difference between old and new length by the length of the search string
(CHAR_LENGTH(name) - CHAR_LENGTH(REPLACE(name, 'substring', '')))
/ CHAR_LENGTH('substring')
Hence:
UPDATE test."user"
SET result =
(CHAR_LENGTH(name) - CHAR_LENGTH(REPLACE(name, 'o', '')))
/ CHAR_LENGTH('o');
Solution 2:
A Postgres'y way of doing this converts the string to an array and counts the length of the array (and then subtracts 1):
select array_length(string_to_array(name, 'o'), 1) - 1
Note that this works with longer substrings as well.
Hence:
update test."user"
set result = array_length(string_to_array(name, 'o'), 1) - 1;
Solution 3:
Other way:
UPDATE test."user" SET result = length(regexp_replace(name, '[^o]', '', 'g'));
Solution 4:
Return count of character,
SELECT (LENGTH('1.1.1.1') - LENGTH(REPLACE('1.1.1.1','.',''))) AS count
--RETURN COUNT OF CHARACTER '.'