Aggregate and update MongoDB


I have 2 collections:

  • clients (6 000 000 documents)
  • orders (50 000 000 documents)

Once a day, I calculate the number of orders in the past year, past month, and past week for each client.

I tried this:

db.orders.aggregate([
    { "$match": { "date_order": { "$gt": v_date1year } } },
    { "$group": {
        "_id": "$id_client",
        "count": { "$sum": 1 }
    }},
    { "$out": "tmp_indicators" }
])

db.tmp_indicators.find({}).forEach(function (my_client) {
    db.clients.update(
        { "id_client": my_client._id },
        { "$set": { "nb_orders_1year": my_client.count } }
    )
})
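For reference, the post never shows how v_date1year is defined. A possible mongo shell definition (an assumption, including the v_date1month and v_date1week names used below for the other two periods) would be:

// Hypothetical date boundaries (not shown in the original post)
var now = new Date();

var v_date1year = new Date(now);
v_date1year.setFullYear(v_date1year.getFullYear() - 1);  // one year back

var v_date1month = new Date(now);
v_date1month.setMonth(v_date1month.getMonth() - 1);      // one month back

var v_date1week = new Date(now);
v_date1week.setDate(v_date1week.getDate() - 7);          // one week back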

I have to do this 3 times: one aggregation for the past year, one for the past month, and one for the past week. The treatment is very slow; do you have an idea of how to perform it in a better way?

For improved performance, especially when dealing with large collections, take advantage of the Bulk() API for bulk updates, sending the operations to the server in batches (for example, a batch size of 1000). This gives you much better performance since you won't be sending every request to the server (as you are currently doing with the update statement within the forEach() loop), but just once in every 1000 requests, making your updates more efficient and quicker.

The following examples demonstrate this approach. The first one uses the Bulk() API, which is available in MongoDB versions >= 2.6 and < 3.2. It updates the documents in the clients collection by changing the nb_orders_1year field with the values from the aggregation results.

Since the aggregate() method returns a cursor, you can use the aggregation output collection's forEach() method to iterate over it and access each document, setting up the bulk update operations in batches to then send across to the server efficiently with the API:

var bulk = db.clients.initializeUnorderedBulkOp(),
    pipeline = [
        { "$match": { "date_order": { "$gt": v_date1year } } },
        { "$group": {
            "_id": "$id_client",
            "count": { "$sum": 1 }
        }},
        { "$out": "tmp_indicators" }
    ],
    counter = 0;

db.orders.aggregate(pipeline);

db.tmp_indicators.find().forEach(function (doc) {
    bulk.find({ "id_client": doc._id }).updateOne({
        "$set": { "nb_orders_1year": doc.count }
    });

    counter++;
    if (counter % 1000 == 0) {
        // Execute per 1000 operations and re-initialize for the next batch
        bulk.execute();
        bulk = db.clients.initializeUnorderedBulkOp();
    }
});

// Clean up the remaining operations in the queue
if (counter % 1000 != 0) { bulk.execute(); }
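Since the same routine has to run for the year, month, and week windows, the Bulk() pattern above can be wrapped in a small helper that takes the date boundary and the target field name as parameters. This is only a sketch: the updatePeriodCount function name and its parameters are hypothetical, and it assumes v_date1month and v_date1week are defined analogously to v_date1year.

// Hypothetical helper (not from the original answer): aggregates one time
// window and bulk-updates the corresponding counter field on clients.
function updatePeriodCount(dateBoundary, fieldName) {
    db.orders.aggregate([
        { "$match": { "date_order": { "$gt": dateBoundary } } },
        { "$group": { "_id": "$id_client", "count": { "$sum": 1 } } },
        { "$out": "tmp_indicators" }
    ]);

    var bulk = db.clients.initializeUnorderedBulkOp(),
        counter = 0;

    db.tmp_indicators.find().forEach(function (doc) {
        var update = {};
        update[fieldName] = doc.count;  // e.g. { "nb_orders_1month": 42 }
        bulk.find({ "id_client": doc._id }).updateOne({ "$set": update });

        counter++;
        if (counter % 1000 == 0) {
            bulk.execute();  // flush every 1000 operations
            bulk = db.clients.initializeUnorderedBulkOp();
        }
    });
    if (counter % 1000 != 0) { bulk.execute(); }  // flush the remainder
}

updatePeriodCount(v_date1year,  "nb_orders_1year");
updatePeriodCount(v_date1month, "nb_orders_1month");
updatePeriodCount(v_date1week,  "nb_orders_1week");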

The next example applies to the new MongoDB version 3.2, which has since deprecated the Bulk() API and provided a newer set of APIs using bulkWrite().

It uses the same cursor as above, but instead of iterating the result, it creates an array with the bulk operations by using its map() method:

var pipeline = [
    { "$match": { "date_order": { "$gt": v_date1year } } },
    { "$group": {
        "_id": "$id_client",
        "count": { "$sum": 1 }
    }},
    { "$out": "tmp_indicators" }
];

db.orders.aggregate(pipeline);

var bulkOps = db.tmp_indicators.find().map(function (doc) {
    return {
        "updateOne": {
            "filter": { "id_client": doc._id },
            "update": { "$set": { "nb_orders_1year": doc.count } }
        }
    };
});

db.clients.bulkWrite(bulkOps, { "ordered": true });
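Finally, since the question mentions running the aggregation three times, it is worth noting that all three counters can be computed in a single pass over orders by making each $sum conditional with $cond. The sketch below is an alternative idea, not part of the original answer; it assumes the v_date1month and v_date1week boundaries exist alongside v_date1year:

// One aggregation pass computes all three counters per client.
// $cond emits 1 when the order falls inside the window, 0 otherwise.
var pipeline = [
    { "$match": { "date_order": { "$gt": v_date1year } } },  // widest window
    { "$group": {
        "_id": "$id_client",
        "nb_orders_1year": { "$sum": 1 },
        "nb_orders_1month": {
            "$sum": { "$cond": [{ "$gt": ["$date_order", v_date1month] }, 1, 0] }
        },
        "nb_orders_1week": {
            "$sum": { "$cond": [{ "$gt": ["$date_order", v_date1week] }, 1, 0] }
        }
    }}
];

var bulkOps = db.orders.aggregate(pipeline, { "allowDiskUse": true }).map(function (doc) {
    return {
        "updateOne": {
            "filter": { "id_client": doc._id },
            "update": { "$set": {
                "nb_orders_1year": doc.nb_orders_1year,
                "nb_orders_1month": doc.nb_orders_1month,
                "nb_orders_1week": doc.nb_orders_1week
            }}
        }
    };
});

db.clients.bulkWrite(bulkOps, { "ordered": false });

With roughly 6 000 000 clients, the bulkOps array can get large; in practice it may need to be flushed to bulkWrite() in chunks (for example, every 1000 entries), following the same batching idea as the Bulk() example above.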
