Does SQLAlchemy have an equivalent of Django's get_or_create?

Solution 1:

Following the solution of @WoLpH, this is the code that worked for me (simple version):

def get_or_create(session, model, **kwargs):
    instance = session.query(model).filter_by(**kwargs).first()
    if instance:
        return instance
    else:
        instance = model(**kwargs)
        session.add(instance)
        session.commit()
        return instance

With this, I'm able to get_or_create any object of my model.

Suppose my model object is :

class Country(Base):
    __tablename__ = 'countries'
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)

To get or create my object I write :

myCountry = get_or_create(session, Country, name=countryName)

Solution 2:

That's basically the way to do it, there is no shortcut readily available AFAIK.

You could generalize it ofcourse:

def get_or_create(session, model, defaults=None, **kwargs):
    instance = session.query(model).filter_by(**kwargs).one_or_none()
    if instance:
        return instance, False
    else:
        params = {k: v for k, v in kwargs.items() if not isinstance(v, ClauseElement)}
        params.update(defaults or {})
        instance = model(**params)
        try:
            session.add(instance)
            session.commit()
        except Exception:  # The actual exception depends on the specific database so we catch all exceptions. This is similar to the official documentation: https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
            session.rollback()
            instance = session.query(model).filter_by(**kwargs).one()
            return instance, False
        else:
            return instance, True

2020 update (Python 3.9+ ONLY)

Here is a cleaner version with Python 3.9's the new dict union operator (|=)

def get_or_create(session, model, defaults=None, **kwargs):
    instance = session.query(Model).filter_by(**kwargs).one_or_none()
    if instance:
        return instance, False
    else:
        kwargs |= defaults or {}
        instance = model(**params)
        try:
            session.add(instance)
            session.commit()
        except Exception:  # The actual exception depends on the specific database so we catch all exceptions. This is similar to the official documentation: https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
            session.rollback()
            instance = session.query(model).filter_by(**kwargs).one()
            return instance, False
        else:
            return instance, True

Note:

Similar to the Django version this will catch duplicate key constraints and similar errors. If your get or create is not guaranteed to return a single result it can still result in race conditions.

To alleviate some of that issue you would need to add another one_or_none() style fetch right after the session.commit(). This still is no 100% guarantee against race conditions unless you also use a with_for_update() or serializable transaction mode.

Solution 3:

I've been playing with this problem and have ended up with a fairly robust solution:

def get_one_or_create(session,
                      model,
                      create_method='',
                      create_method_kwargs=None,
                      **kwargs):
    try:
        return session.query(model).filter_by(**kwargs).one(), False
    except NoResultFound:
        kwargs.update(create_method_kwargs or {})
        created = getattr(model, create_method, model)(**kwargs)
        try:
            session.add(created)
            session.flush()
            return created, True
        except IntegrityError:
            session.rollback()
            return session.query(model).filter_by(**kwargs).one(), False

I just wrote a fairly expansive blog post on all the details, but a few quite ideas of why I used this.

  1. It unpacks to a tuple that tells you if the object existed or not. This can often be useful in your workflow.

  2. The function gives the ability to work with @classmethod decorated creator functions (and attributes specific to them).

  3. The solution protects against Race Conditions when you have more than one process connected to the datastore.

EDIT: I've changed session.commit() to session.flush() as explained in this blog post. Note that these decisions are specific to the datastore used (Postgres in this case).

EDIT 2: I’ve updated using a {} as a default value in the function as this is typical Python gotcha. Thanks for the comment, Nigel! If your curious about this gotcha, check out this StackOverflow question and this blog post.

Solution 4:

A modified version of erik's excellent answer

def get_one_or_create(session,
                      model,
                      create_method='',
                      create_method_kwargs=None,
                      **kwargs):
    try:
        return session.query(model).filter_by(**kwargs).one(), True
    except NoResultFound:
        kwargs.update(create_method_kwargs or {})
        try:
            with session.begin_nested():
                created = getattr(model, create_method, model)(**kwargs)
                session.add(created)
            return created, False
        except IntegrityError:
            return session.query(model).filter_by(**kwargs).one(), True
  • Use a nested transaction to only roll back the addition of the new item instead of rolling back everything (See this answer to use nested transactions with SQLite)
  • Move create_method. If the created object has relations and it is assigned members through those relations, it is automatically added to the session. E.g. create a book, which has user_id and user as corresponding relationship, then doing book.user=<user object> inside of create_method will add book to the session. This means that create_method must be inside with to benefit from an eventual rollback. Note that begin_nested automatically triggers a flush.

Note that if using MySQL, the transaction isolation level must be set to READ COMMITTED rather than REPEATABLE READ for this to work. Django's get_or_create (and here) uses the same stratagem, see also the Django documentation.