Eliminating duplicate values based on only one column of the table
My query:
SELECT sites.siteName, sites.siteIP, history.date
FROM sites INNER JOIN
history ON sites.siteName = history.siteName
ORDER BY siteName,date
First part of the output:
How can I remove the duplicates in siteName
column? I want to leave only the updated one based on date
column.
In the example output above, I need the rows 1, 3, 6, 10
This is where the window function row_number()
comes in handy:
SELECT s.siteName, s.siteIP, h.date
FROM sites s INNER JOIN
(select h.*, row_number() over (partition by siteName order by date desc) as seqnum
from history h
) h
ON s.siteName = h.siteName and seqnum = 1
ORDER BY s.siteName, h.date
From your example it seems reasonable to assume that the siteIP
column is determined by the siteName
column (that is, each site has only one siteIP
). If this is indeed the case, then there is a simple solution using group by
:
select
sites.siteName,
sites.siteIP,
max(history.date)
from sites
inner join history on
sites.siteName=history.siteName
group by
sites.siteName,
sites.siteIP
order by
sites.siteName;
However, if my assumption is not correct (that is, it is possible for a site to have multiple siteIP
), then it is not clear from you question which siteIP
you want the query to return in the second column. If just any siteIP
, then the following query will do:
select
sites.siteName,
min(sites.siteIP),
max(history.date)
from sites
inner join history on
sites.siteName=history.siteName
group by
sites.siteName
order by
sites.siteName;