Function executes faster without STRICT modifier?
I stumbled upon a slump in performance when a simple SQL function is declared STRICT while answering this question.
For demonstration, I created two variants of a function ordering two elements of an array in ascending order.
Test setup
Table with 10000 random pairs of integers:
CREATE TABLE tbl (arr int[]);
INSERT INTO tbl
SELECT ARRAY[(random() * 1000)::int, (random() * 1000)::int]
FROM generate_series(1,10000);
Function without STRICT modifier:
CREATE OR REPLACE FUNCTION f_sort_array(int[])
RETURNS int[]
LANGUAGE sql IMMUTABLE AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;
Function with STRICT modifier (otherwise identical):
CREATE OR REPLACE FUNCTION f_sort_array_strict(int[])
RETURNS int[]
LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;
Results
I executed each around 20 times and took the best result from EXPLAIN ANALYZE.
SELECT f_sort_array(arr) FROM tbl; -- Total runtime: 43 ms
SELECT f_sort_array_strict(arr) FROM tbl; -- Total runtime: 103 ms
These are the results from Postgres 9.0.5 on Debian Squeeze. Similar results on 8.4.
In a test with all NULL values both functions perform the same: ~37 ms.
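The setup for the all-NULL test is not shown above. A minimal sketch of how it could be reproduced, assuming a separate table (the name tbl_null is hypothetical) filled with NULL arrays:

```sql
-- Hypothetical table name; not part of the original test setup.
CREATE TABLE tbl_null (arr int[]);

-- 10000 rows, each holding a NULL array instead of a random pair.
INSERT INTO tbl_null
SELECT NULL::int[]
FROM   generate_series(1,10000);

-- Time both variants the same way as in the main test.
EXPLAIN ANALYZE SELECT f_sort_array(arr)        FROM tbl_null;
EXPLAIN ANALYZE SELECT f_sort_array_strict(arr) FROM tbl_null;
```

Both variants return NULL for NULL input here: in the non-STRICT function the comparison $1[1] > $1[2] evaluates to NULL, so CASE falls through to ELSE $1, which is NULL as well.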
I did some research and found an interesting gotcha. Declaring an SQL function STRICT disables function-inlining in most cases. More about that in the PostgreSQL Online Journal or in the pgsql-performance mailing list or in the Postgres Wiki.
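One way to check whether a given call is actually inlined is to look at the output expression the planner reports. A minimal sketch against the test table above; the exact plan text depends on the Postgres version, so no output is reproduced here:

```sql
-- If the function is inlined, the Output line of the scan node should show
-- the expanded CASE expression rather than a call to f_sort_array.
EXPLAIN VERBOSE SELECT f_sort_array(arr) FROM tbl;

-- If the function is not inlined, the Output line should still show
-- a call to f_sort_array_strict(arr).
EXPLAIN VERBOSE SELECT f_sort_array_strict(arr) FROM tbl;
```

Comparing the two Output lines should make it visible whether the STRICT declaration is what keeps the function body from being folded into the query.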
But I am not quite sure how this could be the explanation. Not inlining the function causes a performance slump in this simple scenario? No index, no disc read, no sorting. Maybe an overhead from the repeated function call that is streamlined away by inlining the function?
Retests
Same test, same hardware, Postgres 9.1. Even bigger differences:
SELECT f_sort_array(arr) FROM tbl; -- Total runtime: 27 ms
SELECT f_sort_array_strict(arr) FROM tbl; -- Total runtime: 107 ms
Same test, new hardware, Postgres 9.6. The gap is bigger still:
SELECT f_sort_array(arr) FROM tbl; -- Total runtime: 10 ms
SELECT f_sort_array_strict(arr) FROM tbl; -- Total runtime: 60 ms
Solution 1:
"Maybe an overhead from the repeated function call that is streamlined away by inlining the function?"
That's what I'd guess. You've got a very simple expression there. An actual function-call presumably involves stack setup, passing parameters etc.
The test below, run in psql, gives run-times of 5 ms for the inlined function (f1) and 50 ms for the STRICT one (f2).
BEGIN;
CREATE SCHEMA f;
SET search_path = f;
CREATE FUNCTION f1(int) RETURNS int AS $$SELECT 1$$ LANGUAGE SQL;
CREATE FUNCTION f2(int) RETURNS int AS $$SELECT 1$$ LANGUAGE SQL STRICT;
\timing on
SELECT sum(f1(i)) FROM generate_series(1,10000) i;
SELECT sum(f2(i)) FROM generate_series(1,10000) i;
\timing off
ROLLBACK;