Store common query as column?

Using PostgreSQL, I have a number of queries that look like this:

SELECT <col 1>, <col 2>
     , (SELECT sum(<col x>)
        FROM   <otherTable> 
        WHERE  <other table foreignkeyCol>=<this table keycol>) AS <col 3>
FROM   <tbl>

Given that the sub-select will be identical in every case, is there a way to store that sub-select as a pseudo-column in the table? Essentially, I want to be able to select a column from table A that is a sum of a specific column from table B where the records are related. Is this possible?


Is there a way to store that sub-select as a pseudo-column in the table?

A VIEW, as has been advised, is a perfectly valid solution. Go for it.

But there is another way that fits your question even more closely. You can write a function that takes the table type as parameter to emulate a "computed field" or "generated column".

Consider this test case, derived from your description:

CREATE TABLE tbl_a (a_id int, col1 int, col2 int);
INSERT INTO tbl_a VALUES (1,1,1), (2,2,2), (3,3,3), (4,4,4);

CREATE TABLE tbl_b (b_id int, a_id int, colx int);
INSERT INTO tbl_b VALUES
  (1,1,5),  (2,1,5),  (3,1,1)
, (4,2,8),  (5,2,8),  (6,2,6)
, (7,3,11), (8,3,11), (9,3,11);

Create a function that emulates col3:

CREATE FUNCTION col3(tbl_a)
  RETURNS int8
  LANGUAGE sql STABLE AS
$func$
SELECT sum(colx)
FROM   tbl_b b
WHERE  b.a_id = $1.a_id
$func$;

Now you can query:

SELECT a_id, col1, col2, tbl_a.col3
FROM   tbl_a;

Or even:

SELECT *, a.col3 FROM tbl_a a;

Note how I wrote tbl_a.col3 / a.col3, not just col3. This is essential.

Unlike a "virtual column" in Oracle, it is not included automatically in SELECT * FROM tbl_a. You could use a VIEW for that.
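A minimal sketch of such a view, building on the test case above. The view name tbl_a_with_col3 is an assumption made up for this illustration:

```sql
-- Hypothetical view name; col3 is the function defined above.
CREATE VIEW tbl_a_with_col3 AS
SELECT a.*, a.col3
FROM   tbl_a a;

-- col3 now shows up without being listed explicitly.
SELECT * FROM tbl_a_with_col3;
```

With the sample data this should return col3 = 11, 22 and 33 for a_id 1 to 3, and NULL for a_id 4, which has no rows in tbl_b.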

Why does this work?

The common way to reference a table column is with attribute notation:

SELECT tbl_a.col1 FROM tbl_a;

The common way to call a function is with functional notation:

SELECT col3(tbl_a) FROM tbl_a;

Generally, it's best to stick to these canonical ways, which agree with the SQL standard.

But Postgres also allows functional notation for columns and attribute notation for functions. These work as well:

SELECT col1(tbl_a) FROM tbl_a;
SELECT tbl_a.col3 FROM tbl_a;

More about that in the manual.

You can probably see by now where this is going. This looks as if you had added an extra column to table tbl_a, while col3() is actually a function that takes the current row of tbl_a (or its alias) as a row-type argument and computes a value.

SELECT *, a.col3
FROM   tbl_a AS a;

If there is an actual column col3, it takes priority, and the system does not look for a function of that name taking the row type tbl_a as parameter.

The "beauty" of it: you can add or drop columns from tbl_a and the last query will dynamically return all current columns, whereas a view would only return the columns that existed at creation time (early binding vs. late binding of *).

Of course, you now have to drop the dependent function before you can drop the table. And you have to take care not to invalidate the function when making changes to the table.
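The dependency can be sketched like this, assuming the col3(tbl_a) function from above still exists and nothing else depends on the table:

```sql
-- Expected to fail with a dependency error while col3(tbl_a) exists:
-- DROP TABLE tbl_a;

-- Either drop the function first and then the table:
DROP FUNCTION col3(tbl_a);
DROP TABLE tbl_a;

-- Or, as an alternative, let Postgres remove dependent objects as well:
-- DROP TABLE tbl_a CASCADE;
```

CASCADE removes the function silently along with the table, so the explicit two-step variant is the more careful choice.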

I still wouldn't use it. It's too surprising to the innocent reader.


Apart from a view, you can create a function for the sum. The parameter is referenced as $1 so that it cannot be confused with the column named key:

CREATE FUNCTION sum_other_table(type_of_key) RETURNS bigint
AS $$ SELECT sum(col_x) FROM table_1 WHERE table_1.key = $1 $$ LANGUAGE sql;

and then call it in place of the subquery:

SELECT col_1, col_2, sum_other_table(key) AS col_3
FROM   table_2;

Note that the return type of sum_other_table() depends on the type of the column you're summing up.
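To check the result type for a given column type, pg_typeof() can help. A small sketch using generate_series() as stand-in data (not the tables from the question); the alias g(x) is made up for this example:

```sql
SELECT pg_typeof(sum(x::int))     AS sum_of_int      -- bigint
     , pg_typeof(sum(x::bigint))  AS sum_of_bigint   -- numeric
     , pg_typeof(sum(x::numeric)) AS sum_of_numeric  -- numeric
FROM   generate_series(1, 3) AS g(x);
```

So RETURNS bigint fits an integer column, while a bigint or numeric column would call for RETURNS numeric.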


Apparently this is handled with views, as per lion's comment. So in my case, I used the command:

CREATE VIEW <viewname> AS
SELECT *, (SELECT sum(<col x>)
           FROM   <otherTable>
           WHERE  <otherTable foreignkeyCol> = <thisTable keycol>) AS <col 3>
FROM   <tablename>;

which essentially gives me another table including the desired column.


There are three answers so far, all of which work. Any one of them could be a "best solution" depending on circumstances. With small tables the performance should be pretty close, but none of them are likely to scale well to tables with millions of rows. The fastest way to get the desired results with a large data set would probably be (using Erwin's setup):

SELECT a_id, col1, col2, sum(colx)
FROM tbl_a LEFT JOIN tbl_b b using(a_id)
GROUP BY a_id, col1, col2;

If a_id is declared as a primary key, and this is run under 9.1 or later, the GROUP BY clause can be simplified because col1 and col2 are functionally dependent on a_id.

SELECT a_id, col1, col2, sum(colx)
FROM tbl_a LEFT JOIN tbl_b b using(a_id)
GROUP BY a_id;

The view could be defined this way and it would scale, but I don't think that all the same execution paths will be considered for the approaches using functions, so the fastest execution path might not be used.
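A sketch of a view defined with the join, again using Erwin's setup. The view name tbl_a_sum and the column alias col3 are assumptions chosen for this illustration, and the second EXPLAIN assumes the col3(tbl_a) function from the earlier answer exists:

```sql
-- Hypothetical view name; aggregates tbl_b once per tbl_a row via a join.
CREATE VIEW tbl_a_sum AS
SELECT a.a_id, a.col1, a.col2, sum(b.colx) AS col3
FROM   tbl_a a
LEFT   JOIN tbl_b b USING (a_id)
GROUP  BY a.a_id, a.col1, a.col2;

-- Compare the plan of the join-based view with the function-based variant:
EXPLAIN SELECT * FROM tbl_a_sum;
EXPLAIN SELECT *, a.col3 FROM tbl_a a;
```

Which variant is actually faster depends on table sizes, indexes and the Postgres version, so it is worth checking the plans on real data.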