Normalize array subscripts for 1-dimensional array so they start with 1
PostgreSQL can work with array subscripts starting anywhere.
Consider this example that creates an array with 3 elements with subscripts from 5 to 7:
SELECT ('[5:7]={1,2,3}'::int[]);
Returns:
[5:7]={1,2,3}
Meaning, for instance, that you get the first element with
SELECT ('[5:7]={1,2,3}'::int[])[5];
I want to normalize any given 1-dimensional array to start with array subscript 1.
The best I could come up with:
SELECT ('[5:7]={1,2,3}'::int[])[array_lower('[5:7]={1,2,3}'::int[], 1):array_upper('[5:7]={1,2,3}'::int[], 1)]
Or, the same, easier the read:
WITH x(a) AS (
SELECT '[5:7]={1,2,3}'::int[]
)
SELECT a[array_lower(a, 1):array_upper(a, 1)]
FROM x
Do you know a simpler / faster or at least more elegant way?
Benchmark
For the purpose of testing performance I whipped up this quick benchmark.
Table with 100k rows, simple integer array of random length between 1 an 11:
CREATE TEMP TABLE t (a int[]);
INSERT INTO t -- now with actually varying subscripts
SELECT ('[' || g%10 || ':' || 2*(g%10) || ']={1'
|| repeat(','||g::text, g%10) || '}')::int[]
FROM generate_series(1,100000) g;
EXPLAIN ANALYZE
SELECT
substring(a::text, '{.*$')::int[] -- Total runtime: 949.304 ms
-- a[-2147483648:2147483647] -- Total runtime: 283.877 ms
-- a[array_lower(a, 1):array_upper(a, 1)] -- Total runtime: 311.545 ms
FROM t
So, yes, @Daniel's idea is slightly faster.
@Kevin's text conversion works, too, but doesn't earn many points.
Any other ideas?
Solution 1:
There is a simpler method that is ugly, but I believe technically correct: extract the largest possible slice out of the array, as opposed to the exact slice with computed bounds. It avoids the two function calls.
Example:
select ('[5:7]={1,2,3}'::int[])[-2147483648:2147483647];
results in:
int4 --------- {1,2,3}
Solution 2:
Eventually, something more elegant popped up with Postgres 9.6. The manual:
It is possible to omit the
lower-bound
and/orupper-bound
of a slice specifier; the missing bound is replaced by the lower or upper limit of the array's subscripts. For example:
So it's simple now:
SELECT my_arr[:];
With my example array literal you need enclosing parentheses to make the syntax unambiguous::
SELECT ('[5:7]={1,2,3}'::int[])[:];
About the same performance as Daniel's solution with hard-coded max array subscripts - which is still the way to go with Postgres 9.5 or earlier.
Solution 3:
Not sure if this is already covered, but:
SELECT array_agg(v) FROM unnest('[5:7]={1,2,3}'::int[]) AS a(v);
To test performance I had to add id
column on the test table. Slow.