Sqlite group_concat ordering

In Sqlite I can use group_concat to do:

1...A
1...B
1...C
2...A
2...B
2...C

1...C,B,A
2...C,B,A

but the order of the concatenation is random - according to docs.

I need to sort the output of group_concat to be

1...A,B,C
2...A,B,C

How can I do this?


Solution 1:

Can you not use a subselect with the order by clause in, and then group concat the values?

Something like

SELECT ID, GROUP_CONCAT(Val)
FROM (
   SELECT ID, Val
   FROM YourTable
   ORDER BY ID, Val
   )
GROUP BY ID;

Solution 2:

To be more precise, according to the docs:

The order of the concatenated elements is arbitrary.

It does not really mean random, it just means that the developers reserve the right to use whatever ordering they whish, even different ones for different queries or in different SQLite versions.

With the current version, this ordering might be the one implied by Adrian Stander's answer, as his code does seem to work. So you might just guard yourself with some unit tests and call it a day. But without examining the source code of SQLite really closely you can never be 100% sure this will always work.

If you are willing to build SQLite from source, you can also try to write your own user-defined aggregate function, but there is an easier way.

Fortunately, since version 3.25.0, you have window functions, providing a guaranteed-to-work, although somewhat ugly solution to your problem.

As you can see in the documentation, window functions have their own ORDER BY clauses:

In the example above, the window frame consists of all rows between the previous row ("1 PRECEDING") and the following row ("1 FOLLOWING"), inclusive, where rows are sorted according to the ORDER BY clause in the window-defn (in this case "ORDER BY a").

Note, that this alone would not necessarily mean that all aggregate functions respect the ordering inside a window frame, but if you take a look at the unit tests, you can see this is actually the case:

do_execsql_test 4.10.1 {
  SELECT a, 
    count() OVER (ORDER BY a DESC),
    group_concat(a, '.') OVER (ORDER BY a DESC) 
  FROM t2 ORDER BY a DESC
} {
  6 1 6
  5 2 6.5
  4 3 6.5.4
  3 4 6.5.4.3
  2 5 6.5.4.3.2
  1 6 6.5.4.3.2.1
  0 7 6.5.4.3.2.1.0
}

So, to sum it up, you can write

SELECT ID, GROUP_CONCAT(Val) OVER (PARTITION BY ID ORDER BY Val) FROM YourTable;

resulting in:

1|A
1|A,B
1|A,B,C
2|A
2|A,B
2|A,B,C

Which unfortunately also contains every prefix of your desired aggregations. Instead you want to specify the window frames to always contain the full range, then discard the redundant values, like this:

SELECT DISTINCT ID, GROUP_CONCAT(Val)
OVER (PARTITION BY ID ORDER BY Val ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM YourTable;

or like this:

SELECT * FROM (
    SELECT ID, GROUP_CONCAT(Val)
    OVER (PARTITION BY ID ORDER BY Val ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    FROM YourTable
)
GROUP BY ID;