ActiveRecord Query Union
I've written a couple of complex queries (at least to me) with Ruby on Rail's query interface:
watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})
Both of these queries work fine by themselves. Both return Post objects. I would like to combine these posts into a single ActiveRelation. Since there could be hundreds of thousands of posts at some point, this needs to be done at the database level. If it were a MySQL query, I could simply user the UNION
operator. Does anybody know if I can do something similar with RoR's query interface?
Solution 1:
Here's a quick little module I wrote that allows you to UNION multiple scopes. It also returns the results as an instance of ActiveRecord::Relation.
module ActiveRecord::UnionScope
def self.included(base)
base.send :extend, ClassMethods
end
module ClassMethods
def union_scope(*scopes)
id_column = "#{table_name}.id"
sub_query = scopes.map { |s| s.select(id_column).to_sql }.join(" UNION ")
where "#{id_column} IN (#{sub_query})"
end
end
end
Here's the gist: https://gist.github.com/tlowrimore/5162327
Edit:
As requested, here's an example of how UnionScope works:
class Property < ActiveRecord::Base
include ActiveRecord::UnionScope
# some silly, contrived scopes
scope :active_nearby, -> { where(active: true).where('distance <= 25') }
scope :inactive_distant, -> { where(active: false).where('distance >= 200') }
# A union of the aforementioned scopes
scope :active_near_and_inactive_distant, -> { union_scope(active_nearby, inactive_distant) }
end
Solution 2:
I also have encountered this problem, and now my go-to strategy is to generate SQL (by hand or using to_sql
on an existing scope) and then stick it in the from
clause. I can't guarantee it's any more efficient than your accepted method, but it's relatively easy on the eyes and gives you a normal ARel object back.
watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})
Post.from("(#{watched_news_posts.to_sql} UNION #{watched_topic_posts.to_sql}) AS posts")
You can do this with two different models as well, but you need to make sure they both "look the same" inside the UNION -- you can use select
on both queries to make sure they will produce the same columns.
topics = Topic.select('user_id AS author_id, description AS body, created_at')
comments = Comment.select('author_id, body, created_at')
Comment.from("(#{comments.to_sql} UNION #{topics.to_sql}) AS comments")
Solution 3:
Based on Olives' answer, I did come up with another solution to this problem. It feels a little bit like a hack, but it returns an instance of ActiveRelation
, which is what I was after in the first place.
Post.where('posts.id IN
(
SELECT post_topic_relationships.post_id FROM post_topic_relationships
INNER JOIN "watched" ON "watched"."watched_item_id" = "post_topic_relationships"."topic_id" AND "watched"."watched_item_type" = "Topic" WHERE "watched"."user_id" = ?
)
OR posts.id IN
(
SELECT "posts"."id" FROM "posts" INNER JOIN "news" ON "news"."id" = "posts"."news_id"
INNER JOIN "watched" ON "watched"."watched_item_id" = "news"."id" AND "watched"."watched_item_type" = "News" WHERE "watched"."user_id" = ?
)', id, id)
I'd still appreciate it if anybody has any suggestions to optimize this or improve the performance, because it's essentially executing three queries and feels a little redundant.
Solution 4:
You could also use Brian Hempel's active_record_union gem that extends ActiveRecord
with an union
method for scopes.
Your query would be like this:
Post.joins(:news => :watched).
where(:watched => {:user_id => id}).
union(Post.joins(:post_topic_relationships => {:topic => :watched}
.where(:watched => {:user_id => id}))
Hopefully this will be eventually merged into ActiveRecord
some day.