How to ignore nulls while unmarshalling a MongoDB document?

I would like to know if there's any approach that would allow me to ignore null types while unmarshalling a MongoDB document into a Go struct.

Right now I have some auto-generated Go structs, something like this:

type User struct {
  Name  string `bson:"name"`
  Email string `bson:"email"`
}

Changing the types declared in this struct is not an option, and here's the problem: in a MongoDB database over which I do not have total control, some documents have been inserted with null values where I was not originally expecting nulls. Something like this:

{
  "name": "John Doe",
  "email": null
}

As the string types declared inside my struct are not pointers, they can't receive a nil value, so whenever I try to unmarshal this document into my struct, it returns an error.

Preventing the insertion of this kind of document into the database would be the ideal solution, but for my use case, ignoring the null values would also be acceptable. So after unmarshalling the document, my User instance would look like this:

User {
  Name:  "John Doe",
  Email: "",
}

I've been looking for an annotation flag, an option that could be passed to Find/FindOne, or even a query parameter that prevents fields containing null values from being returned by the database, so far without any success.

Are there any built-in solutions in the mongo-go-driver for this problem?


Solution 1:

The problem is that the current bson codecs do not support decoding a BSON null into a Go string (or encoding a string as null).

One way to handle this is to create a custom decoder for the string type in which we handle null values: we just use the empty string (and, more importantly, don't report an error).

Custom decoders are described by the type bsoncodec.ValueDecoder. They can be registered in a bsoncodec.Registry, using a bsoncodec.RegistryBuilder for example.

Registries can be set / applied at multiple levels, even to a whole mongo.Client, or to a mongo.Database or just to a mongo.Collection, when acquiring them, as part of their options, e.g. options.ClientOptions.SetRegistry().

First let's see how we can do this for string, and next we'll see how to improve / generalize the solution to any type.

1. Handling null strings

First things first, let's create a custom string decoder that can turn a null into a(n empty) string:

import (
    "errors"
    "fmt"
    "reflect"

    "go.mongodb.org/mongo-driver/bson/bsoncodec"
    "go.mongodb.org/mongo-driver/bson/bsonrw"
    "go.mongodb.org/mongo-driver/bson/bsontype"
)

type nullawareStrDecoder struct{}

func (nullawareStrDecoder) DecodeValue(dctx bsoncodec.DecodeContext, vr bsonrw.ValueReader, val reflect.Value) error {
    if !val.CanSet() || val.Kind() != reflect.String {
        return errors.New("bad type or not settable")
    }
    var str string
    var err error
    switch vr.Type() {
    case bsontype.String:
        if str, err = vr.ReadString(); err != nil {
            return err
        }
    case bsontype.Null: // THIS IS THE MISSING PIECE TO HANDLE NULL!
        if err = vr.ReadNull(); err != nil {
            return err
        }
    default:
        return fmt.Errorf("cannot decode %v into a string type", vr.Type())
    }

    val.SetString(str)
    return nil
}

OK, and now let's see how to apply this custom string decoder to a mongo.Client:

clientOpts := options.Client().
    ApplyURI("mongodb://localhost:27017/").
    SetRegistry(
        bson.NewRegistryBuilder().
            RegisterDecoder(reflect.TypeOf(""), nullawareStrDecoder{}).
            Build(),
    )
client, err := mongo.Connect(ctx, clientOpts)

From now on, using this client, whenever you decode results into string values, the registered nullawareStrDecoder will be called to handle the conversion. It accepts BSON null values and sets the Go empty string "" in their place.

But we can do better... Read on...

2. Handling null values of any type: "type-neutral" null-aware decoder

One way would be to create a separate, custom decoder and register it for each type we wish to handle. That seems to be a lot of work.

What we may (and should) do instead is create a single, "type-neutral" custom decoder which handles just nulls and, if the BSON value is not null, calls the default decoder to handle the non-null value.

This is surprisingly simple:

type nullawareDecoder struct {
    defDecoder bsoncodec.ValueDecoder
    zeroValue  reflect.Value
}

func (d *nullawareDecoder) DecodeValue(dctx bsoncodec.DecodeContext, vr bsonrw.ValueReader, val reflect.Value) error {
    if vr.Type() != bsontype.Null {
        return d.defDecoder.DecodeValue(dctx, vr, val)
    }

    if !val.CanSet() {
        return errors.New("value not settable")
    }
    if err := vr.ReadNull(); err != nil {
        return err
    }
    // Set the zero value of val's type:
    val.Set(d.zeroValue)
    return nil
}

We just have to figure out what to use for nullawareDecoder.defDecoder. For this we can use the default registry, bson.DefaultRegistry, and look up the default decoder for each individual type. Cool.

So what we do now is register a value of our nullawareDecoder for all types we want to handle nulls for. It's not that hard. We just list the types (or values of those types) we want this for, and we can take care of all with a simple loop:

customValues := []interface{}{
    "",       // string
    int(0),   // int
    int32(0), // int32
}

rb := bson.NewRegistryBuilder()
for _, v := range customValues {
    t := reflect.TypeOf(v)
    defDecoder, err := bson.DefaultRegistry.LookupDecoder(t)
    if err != nil {
        panic(err)
    }
    rb.RegisterDecoder(t, &nullawareDecoder{defDecoder, reflect.Zero(t)})
}

clientOpts := options.Client().
    ApplyURI("mongodb://localhost:27017/").
    SetRegistry(rb.Build())
client, err := mongo.Connect(ctx, clientOpts)

In the example above I registered null-aware decoders for string, int and int32, but you may extend this list to your liking; just add values of the desired types to the customValues slice above.
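
To see what reflect.Zero(t) and val.Set(...) do in the null branch of the decoder, here is a minimal standalone sketch that uses only the standard library. The setZero helper and the Age field are assumptions made up for illustration; they are not part of the driver or of the code above.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// User mirrors the struct from the question; Age is a hypothetical
// extra field added only to show a second type.
type User struct {
	Name  string
	Email string
	Age   int
}

// setZero is a hypothetical helper that mimics the null branch of
// nullawareDecoder: it overwrites the pointed-to value with the zero
// value of its type.
func setZero(ptr interface{}) error {
	val := reflect.ValueOf(ptr)
	if val.Kind() != reflect.Ptr || val.IsNil() {
		return errors.New("need a non-nil pointer")
	}
	val = val.Elem() // the element of a pointer is addressable and settable
	if !val.CanSet() {
		return errors.New("value not settable")
	}
	val.Set(reflect.Zero(val.Type()))
	return nil
}

func main() {
	u := User{Name: "John Doe", Email: "stale@example.com", Age: 42}
	if err := setZero(&u.Email); err != nil {
		panic(err)
	}
	if err := setZero(&u.Age); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", u) // {Name:John Doe Email: Age:0}
}
```

The same reflect.Zero call works for any type, which is why a single decoder type can serve every entry in the customValues slice.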

Solution 2:

See the MongoDB documentation on the $exists operator and on "Query for Null or Missing Fields" for a detailed explanation.

In the mongo-go-driver, you can try the query below. The {"email": nil} filter matches documents that either contain the email field with a null value or do not contain the email field at all:

cursor, err := coll.Find(
    context.Background(),
    bson.D{{"email", nil}},
)

To get the opposite result, add the $ne operator to the query above: {"email": {"$ne": nil}} matches only the documents in which the email field exists and is not null. See the MongoDB documentation on the $ne operator for more details.
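
As a rough sketch of the two filters discussed here, the snippet below builds them as plain maps and prints their shape. It uses only the standard library: the filter alias and both helper functions are hypothetical names made up for this example, and the only assumption about the driver is that bson.M has the underlying type map[string]interface{}, so such maps can be passed to Find as filters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filter is a hypothetical alias with the same underlying type as the
// driver's bson.M (map[string]interface{}).
type filter = map[string]interface{}

// nullOrMissing matches documents where field is null or absent.
func nullOrMissing(field string) filter {
	return filter{field: nil}
}

// presentAndNotNull matches documents where field exists and is not null.
func presentAndNotNull(field string) filter {
	return filter{field: filter{"$ne": nil}}
}

func main() {
	for _, f := range []filter{nullOrMissing("email"), presentAndNotNull("email")} {
		b, err := json.Marshal(f) // JSON is used here only to display the shape
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
	// Prints:
	// {"email":null}
	// {"email":{"$ne":null}}
}
```

Passing the second filter to coll.Find should return only documents whose email can be decoded into a plain string, at the cost of skipping every document where email is null or missing.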