MySQL: count instances of a substring, then order by the count

I have a problem in MySQL that goes as follows:

  • Count the instances of a substring in a string field in a MySQL database
  • Order the results by the number of occurrences of that substring (DESC)

I have never written anything beyond rudimentary queries, and I can't find a solution elsewhere.


Solution 1:

SELECT (CHAR_LENGTH(str) - CHAR_LENGTH(REPLACE(str, substr, ''))) / CHAR_LENGTH(substr) AS cnt
...
ORDER BY cnt DESC

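Yes, it looks bloated, but as far as I know MySQL has no built-in function for counting substring occurrences. The idea: removing every occurrence of substr shortens the string by CHAR_LENGTH(substr) characters per occurrence, so dividing the length difference by CHAR_LENGTH(substr) gives the number of (non-overlapping) occurrences.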

mysql> select (CHAR_LENGTH('asd') - CHAR_LENGTH(REPLACE('asd', 's', ''))) / CHAR_LENGTH('s');
+--------------------------------------------------------------------------------+
| (CHAR_LENGTH('asd') - CHAR_LENGTH(REPLACE('asd', 's', ''))) / CHAR_LENGTH('s') |
+--------------------------------------------------------------------------------+
|                                                                         1.0000 |
+--------------------------------------------------------------------------------+
1 row in set (0.00 sec)



mysql> select host, (CHAR_LENGTH(host) - CHAR_LENGTH(REPLACE(host, 'l', ''))) / CHAR_LENGTH('l') AS cnt from user;
+-----------+--------+
| host      | cnt    |
+-----------+--------+
| 127.0.0.1 | 0.0000 |
| honeypot  | 0.0000 |
| honeypot  | 0.0000 |
| localhost | 2.0000 |
| localhost | 2.0000 |
+-----------+--------+
5 rows in set (0.00 sec)
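
To tie this back to the question, here is a minimal sketch of the full query with the ordering applied. The table articles and the columns id and body are hypothetical placeholders, and 'foo' stands in for the substring you are looking for. It uses DIV (integer division) instead of / so that cnt comes back as an integer rather than 1.0000.

```sql
-- Hypothetical table and column names; adjust to your schema.
SELECT id,
       body,
       -- characters removed by deleting every match, divided by the match length
       (CHAR_LENGTH(body) - CHAR_LENGTH(REPLACE(body, 'foo', '')))
           DIV CHAR_LENGTH('foo') AS cnt
FROM articles
ORDER BY cnt DESC;
```

Two things to keep in mind: REPLACE() matches case-sensitively, so wrap both arguments in LOWER() if you need a case-insensitive count, and overlapping matches are not counted separately ('aa' is found once in 'aaa').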

Solution 2:

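DELIMITER //
DROP FUNCTION IF EXISTS `subStringCount`//
CREATE FUNCTION `subStringCount` (sequence VARCHAR(1000), word VARCHAR(100)) RETURNS INT
DETERMINISTIC
CONTAINS SQL
BEGIN
    -- Counts non-overlapping occurrences of word in sequence.
    DECLARE counter SMALLINT UNSIGNED DEFAULT 0;
    DECLARE word_length SMALLINT UNSIGNED;

    SET word_length = CHAR_LENGTH(word);

    -- Guard: INSTR() returns 1 for an empty search string,
    -- so without this check the loop below would never end.
    IF word_length = 0 THEN
        RETURN 0;
    END IF;

    WHILE (INSTR(sequence, word) != 0) DO
        SET counter = counter + 1;
        -- Drop everything up to and including the match, then search again.
        SET sequence = SUBSTR(sequence, INSTR(sequence, word) + word_length);
    END WHILE;

    RETURN counter;
END //
DELIMITER ;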

Once created, the function can be called from any query. For example, to total the occurrences across all rows:

SELECT SUM(subStringCount(fieldName, 'subString')) FROM `table`;
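
The SUM() form returns a single total, whereas the question asks for rows ordered by their own count. A minimal sketch of that per-row variant, assuming the subStringCount function above has already been created and reusing the same placeholder names (fieldName, `table`, 'subString'):

```sql
-- One count per row, highest count first.
SELECT fieldName,
       subStringCount(fieldName, 'subString') AS cnt
FROM `table`
ORDER BY cnt DESC;
```

Note that the function's parameters are declared as VARCHAR(1000) and VARCHAR(100), so longer values may be truncated or rejected depending on the SQL mode; widen the parameter types if your column holds longer text.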