How to estimate the size of one column in a Postgres table?
There is a column of type text
in a table in Postgres 9.1. I'd like to know the impact of just that column on the disk space needed. It doesn't need to be precise, but I'd like to get an idea if that column is responsible for 20%/30%/... of the disk space consumed by the database.
I know pg_relation_size
, but it only operates at table level.
I have many databases with this same schema. I dumped a smaller one and cut out the column with grep and cut and compared the size of the plain text dumps. But this is not necessarily a good indicator of space requirements in the live db, and it's also more difficult to do that for large databases.
Solution 1:
select
sum(pg_column_size(the_text_column)) as total_size,
avg(pg_column_size(the_text_column)) as average_size,
sum(pg_column_size(the_text_column)) * 100.0 / pg_relation_size('t') as percentage
from t;
Solution 2:
Slight improvement on the accepted answer: pretty print the size and use pg_total_relation_size to be more accurate.
select
pg_size_pretty(sum(pg_column_size(column_name))) as total_size,
pg_size_pretty(avg(pg_column_size(column_name))) as average_size,
sum(pg_column_size(column_name)) * 100.0 / pg_total_relation_size('table_name') as percentage
from table_name;
Solution 3:
If you want a report for all the columns in a database sorted by size then here is the way
BEGIN;
CREATE FUNCTION tc_column_size(table_name text, column_name text)
RETURNS BIGINT AS
$$
declare response BIGINT;
BEGIN
EXECUTE 'select sum(pg_column_size(t."' || column_name || '")) from ' || table_name || ' t ' into response;
return response;
END;
$$
LANGUAGE plpgsql;
SELECT
z.table_name,
z.column_name,
pg_size_pretty(z.size)
FROM (
SELECT
table_name,
column_name,
tc_column_size(table_name, column_name) size
FROM
information_schema.columns
WHERE
table_schema = 'public') AS z
WHERE
size IS NOT NULL
-- and z.table_name = 'my_table' -- <--- uncomment to filter a table
ORDER BY
z.size DESC;
ROLLBACK; -- <--- You may not want to keep that function