How to change datatype of nested field in Mongo document?

My Mongo structure as below,

"topProcesses" : [
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "1",
            "memoryUtilizationPercent" : "0.1",
            "command" : "init",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "2",
            "memoryUtilizationPercent" : "0.0",
            "command" : "kthreadd",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "3",
            "memoryUtilizationPercent" : "0.0",
            "command" : "ksoftirqd/0",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "5",
            "memoryUtilizationPercent" : "0.0",
            "command" : "kworker/0:+",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "6",
            "memoryUtilizationPercent" : "0.0",
            "command" : "kworker/u3+",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "8",
            "memoryUtilizationPercent" : "0.0",
            "command" : "rcu_sched",
            "user" : "root"
        } 
    ]

Now in above documents topProcesses.cpuUtilizationPercent is in string and I wanted to change topProcesses.cpuUtilizationPercent data type to Float. For this I tried below but it did not work

db.collectionName.find({
   "topProcesses":{"$exists":true}}).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++){
   db.collectionName.update({_id: data._id},{$set:{"topProcesses.$.cpuUtilizationPercent":parseFloat(data.topProcesses[ii].cpuUtilizationPercent)}},false,true);
  }
})

Can any one help how to changed string to float in nested Mongo documents


You are doing this the right way but you did not include the array element to match in the query portion of the .update():

db.collectionName.find({
   "topProcesses":{"$exists":true}}).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++) {
      db.collectionName.update(
         { 
             "_id": data._id, 
             "topProcesses.processId": data.topProcesses[ii].processId // corrected
         },
         {
             "$set": {
               "topProcesses.$.cpuUtilizationPercent":
                   parseFloat(data.topProcesses[ii].cpuUtilizationPercent)
             }
         }
      );
  }
})

So you need to match something in the array in order for the positional $ operator to have any effect.

You also could have just used the "index" value in the notation, since you are producing that in a loop anyway:

db.collectionName.find({
   "topProcesses":{"$exists":true}}).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++) {

      var updoc =  { 
          "$set": {}
      };

      var myKey = "topProcesses." + ii + ".cpuUtilizationPercent";
      updoc["$set"][myKey] = parseFloat(data.topProcesses[ii].cpuUtilizationPercent);

      db.collectionName.update(
         { 
             "_id": data._id
         },
         updoc
      );
  }
})

Which just uses the matching index and is handy where there is no unique identifier of the array element.

Also note that neither the "upsert" or "multi" options should apply here due to the nature of how this is processes existing documents.


Just as a "postscript" note to this, it is also worthwhile to consider the Bulk Operations API of MongoDB in versions from 2.6 and greater. Using these API methods you can significantly reduce the amount of network traffic between your client application and the database. The obvious improvement here is in the overall speed:

var bulk = db.collectionName.initializeOrderedBulkOp();
var counter = 0;

db.collectionName.find({
   "topProcesses":{"$exists":true}}
).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++) {

      var updoc =  { 
          "$set": {}
      };

      var myKey = "topProcesses." + ii + ".cpuUtilizationPercent";
      updoc["$set"][myKey] = parseFloat(data.topProcesses[ii].cpuUtilizationPercent);

      // queue the update
      bulk.find({ "_id": data._id }).update(updoc);
      counter++;

      // Drain and re-initialize every 1000 update statements
      if ( counter % 1000 == 0 ) {
          bulk.execute();
          bulk = db.collectionName.initializeOrderedBulkOp();
      }
  }
})

// Add the rest in the queue
if ( counter % 1000 != 0 )
    bulk.execute();

This basically reduces the amount of operations statements sent to the sever to only sending once every 1000 queued operations. You can play with that number and how things are grouped but it will give a significant increase in speed in a relatively safe way.