SQLAlchemy - subquery in a WHERE clause

Solution 1:

This should work (different SQL, same result):

from sqlalchemy import and_, func

t = Session.query(
    Posts.user_id,
    func.max(Posts.post_time).label('max_post_time'),
).group_by(Posts.user_id).subquery('t')

query = Session.query(User, Posts).filter(and_(
    User.user_id == Posts.user_id,
    User.user_id == t.c.user_id,
    Posts.post_time == t.c.max_post_time,
))

for user, post in query:
    print(user.user_id, post.post_id)

Here t.c is the subquery's column collection (c stands for 'columns').
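For anyone who wants to try the grouped-subquery approach end to end, here is a minimal sketch against an in-memory SQLite database. The table names, column types, and sample rows are my own assumptions (the question does not show the models), and it assumes SQLAlchemy 1.4 or newer for declarative_base and the Session context manager.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, and_, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"  # assumed table name
    user_id = Column(Integer, primary_key=True)


class Posts(Base):
    __tablename__ = "posts"  # assumed table name
    post_id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.user_id"))
    post_time = Column(DateTime)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample data: user 1 has two posts, user 2 has one.
    session.add_all([
        User(user_id=1),
        User(user_id=2),
        Posts(post_id=10, user_id=1, post_time=datetime(2020, 1, 1)),
        Posts(post_id=11, user_id=1, post_time=datetime(2020, 6, 1)),
        Posts(post_id=20, user_id=2, post_time=datetime(2020, 3, 1)),
    ])
    session.commit()

    # One row per user: the user id and the time of that user's newest post.
    t = session.query(
        Posts.user_id,
        func.max(Posts.post_time).label("max_post_time"),
    ).group_by(Posts.user_id).subquery("t")

    # Keep only the posts whose time matches the per-user maximum.
    query = session.query(User, Posts).filter(and_(
        User.user_id == Posts.user_id,
        User.user_id == t.c.user_id,
        Posts.post_time == t.c.max_post_time,
    )).order_by(User.user_id)

    latest = [(user.user_id, post.post_id) for user, post in query]

print(latest)
```

Note that if a user has two posts with exactly the same post_time, both rows match and both are returned.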

Solution 2:

The previous answer works, but the exact SQL you asked for can also be written almost exactly like the actual statement:

print(
    s.query(User, Posts)
    .outerjoin(Posts.user)
    .filter(
        Posts.post_time
        == s.query(func.max(Posts.post_time))
        .filter(Posts.user_id == User.user_id)
        .correlate(User)
        .as_scalar()  # renamed scalar_subquery() in SQLAlchemy 1.4+
    )
)

I guess the "concept" that isn't necessarily apparent is that as_scalar() is currently needed to mark the subquery as a "scalar" one (it should probably infer that from its use on the right-hand side of ==).

Edit: Confirmed, that was buggy behavior; it is fixed in ticket #2190. As of the current tip or release 0.7.2, as_scalar() is called automatically, and the above query can be written as:

print(
    s.query(User, Posts)
    .outerjoin(Posts.user)
    .filter(
        Posts.post_time
        == s.query(func.max(Posts.post_time))
        .filter(Posts.user_id == User.user_id)
        .correlate(User)
    )
)

Solution 3:

This is usually expressed much like the actual SQL: you create a subquery that returns a single result and compare against it. What can be a real pain, however, is having to use a table in the subquery that you are already querying or joining on.

The solution is to create an aliased version of the model and reference that in the subquery.

Say you already have a connection set up, with an existing Posts model and a basic query ready, and you now want the latest (single) post from each user. You would filter the query like this:

from sqlalchemy.orm import aliased

posts2 = aliased(Posts)  # aliased version of the model, used inside the subquery
model = Posts  # assumption: "model" is the entity the existing query selects

query = query.filter(
    model.post_id
    ==
    Posts.query  # create the query directly from the model, NOT from the aliased version!
        .with_entities(posts2.post_id)  # select only the post_id column
        .filter(
            posts2.user_id == model.user_id
        )
        .order_by(posts2.post_id.desc())  # assumes a higher id means a newer post
        .limit(1)  # limit to a single row so the subquery yields exactly one value
)

I purposely did not use func.max, because I consider that the simpler version and it is already covered in other answers. I think this example will be useful to people who find this question while looking for a way to subquery the same table.
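If your models do not have a query attribute (Posts.query above is not part of plain SQLAlchemy models), the same aliased-subquery idea can be sketched with an ordinary Session. This is only an illustration under my own assumptions: a stripped-down Posts model, made-up rows, SQLAlchemy 1.4 or newer, and an explicit correlate() / scalar_subquery() instead of relying on automatic behavior.

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, aliased, declarative_base

Base = declarative_base()


class Posts(Base):
    __tablename__ = "posts"  # assumed table name
    post_id = Column(Integer, primary_key=True)
    user_id = Column(Integer)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample data: user 1 has two posts, user 2 has one.
    session.add_all([
        Posts(post_id=10, user_id=1),
        Posts(post_id=11, user_id=1),
        Posts(post_id=20, user_id=2),
    ])
    session.commit()

    posts2 = aliased(Posts)  # second name for the same table

    # For the outer row's user, pick the highest post_id (assumed newest).
    newest_id = (
        session.query(posts2.post_id)
        .filter(posts2.user_id == Posts.user_id)
        .correlate(Posts)  # Posts refers to the outer query's row
        .order_by(posts2.post_id.desc())
        .limit(1)  # exactly one value for the comparison
        .scalar_subquery()
    )

    query = (
        session.query(Posts)
        .filter(Posts.post_id == newest_id)
        .order_by(Posts.post_id)
    )
    latest_ids = [post.post_id for post in query]

print(latest_ids)
```

The alias is what lets the inner and outer parts of the statement refer to the posts table independently; without it, both sides would name the same table and the comparison could not distinguish them.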