Joining Results from Two Separate Databases
Is it possible to JOIN
rows from two separate postgres databases?
I am working with system with couple databases in one server and sometimes I really need such a feature.
According to http://wiki.postgresql.org/wiki/FAQ
There is no way to query a database other than the current one. Because PostgreSQL loads database-specific system catalogs, it is uncertain how a cross-database query should even behave. contrib/dblink allows cross-database queries using function calls. Of course, a client can also make simultaneous connections to different databases and merge the results on the client side.
EDIT: 3 years later (march 2014), this FAQ entry has been revised and is more helpful:
How do I perform queries using multiple databases?
There is no way to directly query a database other than the current one. Because PostgreSQL loads database-specific system catalogs, it is uncertain how a cross-database query should even behave.
The SQL/MED support in PostgreSQL allows a "foreign data wrapper" to be created, linking tables in a remote database to the local database. The remote database might be another database on the same PostgreSQL instance, or a database half way around the world, it doesn't matter. postgres_fdw is built-in to PostgreSQL 9.3 and includes read/write support; a read-only version for 9.2 can be compiled and installed as a contrib module.
contrib/dblink allows cross-database queries using function calls and is available for much older PostgreSQL versions. Unlike postgres_fdw it can't "push down" conditions to the remote server, so it'll often land up fetching a lot more data than you need.
Of course, a client can also make simultaneous connections to different databases and merge the results on the client side.
Forget about dblink!
Say hello to Postgres_FDW:
To prepare for remote access using
postgres_fdw
:
Install the
postgres_fdw
extension usingCREATE EXTENSION
.Create a foreign server object, using
CREATE SERVER
, to represent each remote database you want to connect to. Specify connection information, except user, and password, as options of the server object.Create a user mapping, using
CREATE USER MAPPING
, for each database user you want to allow to access each foreign server. Specify the remote user name and password to use as user and password options of the user mapping.Create a foreign table, using
CREATE FOREIGN TABLE
orIMPORT FOREIGN SCHEMA
, for each remote table you want to access. The columns of the foreign table must match the referenced remote table. You can, however, use table and/or column names different from the remote table's, if you specify the correct remote names as options of the foreign table object.Now you need only
SELECT
from a foreign table to access the data stored in its underlying remote table.
It's really useful even on large data.
Yes, it is possible to do this using dblink
albeit with significant performance considerations.
The following example will require the current SQL user to have permissions on both databases. If db2
is not located on the same cluster, then you will need to replace dbname=db2
with the full connection string defined in the dblink documentation.
SELECT *
FROM table1 tb1
LEFT JOIN (
SELECT *
FROM dblink('dbname=db2','SELECT id, code FROM table2')
AS tb2(id int, code text);
) AS tb2 ON tb2.column = tb1.column;
If table2
is very large, you could have performance issues because the sub-query loads up the entire table2
before performing the join.