Update MongoDB field using value of another field

In MongoDB, is it possible to update the value of a field using the value from another field? The equivalent SQL would be something like:

UPDATE Person SET Name = FirstName + ' ' + LastName

And the MongoDB pseudo-code would be:

db.person.update( {}, { $set : { name : firstName + ' ' + lastName } } );

styvane

The best way to do this is with MongoDB 4.2+, which accepts an aggregation pipeline as the update document in the updateOne, updateMany, and update collection methods (update is deprecated in most, if not all, language drivers).

MongoDB 4.2+

Version 4.2 also introduced the $set pipeline stage, which is an alias for $addFields. I will use $set here because it maps directly to what we are trying to achieve.

db.collection.<update method>(
    {},
    [
        {"$set": {"name": { "$concat": ["$firstName", " ", "$lastName"]}}}
    ]
)

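Note that the square brackets around the second argument make it an aggregation pipeline rather than a plain update document. With a plain document, the server treats { "$concat": [...] } as a literal value to store rather than an expression to evaluate, so the update fails or writes the wrong data.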
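As a rough illustration of the pipeline shape, here is a minimal sketch in plain JavaScript that only builds the update pipeline. The helper name concatFieldsPipeline and the collection name person in the trailing comment are assumptions for illustration, not part of any MongoDB API.

```javascript
// Build a one-stage update pipeline that sets `target` to the given
// source fields joined by `separator`. The helper name is hypothetical.
function concatFieldsPipeline(target, sourceFields, separator = " ") {
  const parts = [];
  sourceFields.forEach((field, i) => {
    if (i > 0) parts.push(separator); // literal string between fields
    parts.push("$" + field);          // "$field" reads the document's own value
  });
  // The surrounding array is what marks this as a pipeline.
  return [{ $set: { [target]: { $concat: parts } } }];
}

const pipeline = concatFieldsPipeline("name", ["firstName", "lastName"]);
console.log(JSON.stringify(pipeline));

// In the shell against MongoDB 4.2+ (collection name `person` assumed):
// db.person.updateMany({}, pipeline);
```

The same shape should work with other expression operators, for example $add in place of $concat when the source fields are numeric.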

MongoDB 3.4+

In 3.4+, you can use the $addFields and $out aggregation pipeline stages.

db.collection.aggregate(
    [
        { "$addFields": { 
            "name": { "$concat": [ "$firstName", " ", "$lastName" ] } 
        }},
        { "$out": <output collection name> }
    ]
)

Note that this does not update your collection in place; $out replaces the existing collection or creates a new one. Also, for update operations that require "typecasting", you will need client-side processing, and depending on the operation, you may need to use the find() method instead of the .aggregate() method.

MongoDB 3.2 and 3.0

The way we do this is by $projecting our documents and using the $concat string aggregation operator to return the concatenated string. You then iterate the cursor and use the $set update operator to add the new field to your documents using bulk operations for maximum efficiency.

Aggregation query:

var cursor = db.collection.aggregate([ 
    { "$project":  { 
        "name": { "$concat": [ "$firstName", " ", "$lastName" ] } 
    }}
])

MongoDB 3.2 or newer

You need to use the bulkWrite method.

var requests = [];
cursor.forEach(document => { 
    requests.push( { 
        'updateOne': {
            'filter': { '_id': document._id },
            'update': { '$set': { 'name': document.name } }
        }
    });
    if (requests.length === 500) {
        //Execute per 500 operations and re-init
        db.collection.bulkWrite(requests);
        requests = [];
    }
});

if(requests.length > 0) {
     db.collection.bulkWrite(requests);
}
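The batching logic above can be sketched independently of a server. In this minimal sketch the write callback stands in for db.collection.bulkWrite, and the names writeInBatches, docs, and batchSizes are hypothetical; it only demonstrates that the final partial batch gets flushed.

```javascript
// Turn documents into updateOne requests and hand them to `write`
// in batches of at most `batchSize`. `write` stands in for
// db.collection.bulkWrite; all names here are hypothetical.
function writeInBatches(docs, batchSize, write) {
  let requests = [];
  for (const doc of docs) {
    requests.push({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { name: doc.name } },
      },
    });
    if (requests.length === batchSize) {
      write(requests); // full batch: send it and start a new one
      requests = [];
    }
  }
  if (requests.length > 0) write(requests); // flush the final partial batch
}

// Stub that records batch sizes instead of talking to a server.
const batchSizes = [];
const docs = Array.from({ length: 1200 }, (_, i) => ({ _id: i, name: "n" + i }));
writeInBatches(docs, 500, (batch) => batchSizes.push(batch.length));
console.log(batchSizes.join(",")); // 500,500,200
```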

MongoDB 2.6 and 3.0

Starting with 2.6, you need to use the now-deprecated Bulk API and its associated methods.

var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;

cursor.snapshot().forEach(function(document) { 
    bulk.find({ '_id': document._id }).updateOne( {
        '$set': { 'name': document.name }
    });
    count++;
    if(count%500 === 0) {
        // Execute per 500 operations and re-init
        bulk.execute();
        bulk = db.collection.initializeUnorderedBulkOp();
    }
})

// Flush the remaining operations (skipped when the last batch was already executed)
if (count % 500 !== 0) {
    bulk.execute();
}

MongoDB 2.4

cursor["result"].forEach(function(document) {
    db.collection.update(
        { "_id": document._id }, 
        { "$set": { "name": document.name } }
    );
})

4.2+ Doesn't work. MongoError: The dollar ($) prefixed field '$concat' in 'name.$concat' is not valid for storage.
@JoshWoodcock, I think you had a typo in the query you are running. I suggest you double check.
For those running into the same problem @JoshWoodcock described: pay attention that the answer for 4.2+ describes an aggregation pipeline, so don't miss the square brackets in the second parameter!
Is it possible to do the same thing as in this solution but instead of concatenating two strings together, add two numbers together?
How many times are they going to change this before it becomes a joke?
evandrix

You should iterate through. For your specific case:

db.person.find().snapshot().forEach(
    function (elem) {
        db.person.update(
            {
                _id: elem._id
            },
            {
                $set: {
                    name: elem.firstname + ' ' + elem.lastname
                }
            }
        );
    }
);

What happens if another user has changed the document between your find() and your save()?
True, but copying between fields should not require transactions to be atomic.
It's important to notice that save() fully replaces the document. Should use update() instead.
How about db.person.update( { _id: elem._id }, { $set: { name: elem.firstname + ' ' + elem.lastname } } );
I created a function called create_guid that only produced a unique guid per document when iterating with forEach in this way (i.e. simply using create_guid in an update statement with multi=true caused the same guid to be generated for all documents). This answer worked perfectly for me. +1
Niels van der Rest

Apparently there is a way to do this efficiently since MongoDB 3.4, see styvane's answer.

Obsolete answer below

You cannot refer to the document itself in an update (yet). You'll need to iterate through the documents and update each document using a function. See this answer for an example, or this one for server-side eval().


Is this still valid today?
@ChristianEngel: It appears so. I wasn't able to find anything in the MongoDB docs that mentions a reference to the current document in an update operation. This related feature request is still unresolved as well.
Is it still valid in April 2017? Or there are already new features which can do this?
@Kim It looks like it is still valid. Also, the feature request that @niels-van-der-rest pointed out back in 2013 is still in OPEN.
this is not a valid answer anymore, have a look at @styvane answer
Eric Kigathi

For a database with high activity, your updates may collide with records that are actively changing. For this reason I recommend using snapshot():

db.person.find().snapshot().forEach( function (hombre) {
    hombre.name = hombre.firstName + ' ' + hombre.lastName; 
    db.person.save(hombre); 
});

http://docs.mongodb.org/manual/reference/method/cursor.snapshot/


What happens if another user edited the person between the find() and save()? I have a case where multiple calls can be done to the same object changing them based on their current values. The 2nd user should have to wait with reading until the 1st is done with saving. Does this accomplish that?
About snapshot(): deprecated in the mongo shell since v3.2. Starting in v3.2, the $snapshot operator is deprecated in the mongo shell. In the mongo shell, use cursor.snapshot() instead.
Xavier Guihot

Starting Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update/creation of a field based on another field:

// { firstName: "Hello", lastName: "World" }
db.collection.update(
  {},
  [{ $set: { name: { $concat: [ "$firstName", " ", "$lastName" ] } } }],
  { multi: true }
)
// { "firstName" : "Hello", "lastName" : "World", "name" : "Hello World" }

The first part {} is the match query, filtering which documents to update (in our case all documents).

The second part [{ $set: { name: { ... } } }] is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline). $set is a new aggregation stage and an alias of $addFields.

Don't forget { multi: true }, otherwise only the first matching document will be updated.


Worked for me. Was assigning one field to another without concatenation and it worked. Thank you!
what is the difference between your first point and your third? {} mean all doc then why { multi: true }
@Coder17 the first part {} is a filtering part: for instance you might want to update doc1 and doc2 but not doc3. Without the third part, by default, the update will be applied only on 1 document for instance doc1, and doc2 wouldn't be updated. Note that you can also use db.collection.updateMany to get rid of the third parameter.
Aldo

Regarding this answer, the snapshot function is deprecated in version 3.6, according to this update. So, on version 3.6 and above, it is possible to perform the operation this way:

db.person.find().forEach(
    function (elem) {
        db.person.update(
            {
                _id: elem._id
            },
            {
                $set: {
                    name: elem.firstname + ' ' + elem.lastname
                }
            }
        );
    }
);

Chris Gibb

I tried the above solution but I found it unsuitable for large amounts of data. I then discovered the stream feature:

MongoClient.connect("...", function(err, db){
    var c = db.collection('yourCollection');
    var s = c.find({/* your query */}).stream();
    s.on('data', function(doc){
        c.update({_id: doc._id}, {$set: {name : doc.firstName + ' ' + doc.lastName}}, function(err, result) { /* result == true? */ });
    });
    s.on('end', function(){
        // stream can end before all your updates do if you have a lot
    })
})

How is this different? Will the stream be throttled by the update activity? Do you have any reference to it? The Mongo docs are quite poor.
Yuvaraj Anbarasan

The update() method accepts an aggregation pipeline as its second parameter, for example:

db.collection_name.update(
  {
    // Query
  },
  [
    // Aggregation pipeline
    { "$set": { "id": "$_id" } }
  ],
  {
    // Options
    "multi": true // false when a single doc has to be updated
  }
)

The field can be set or unset with existing values using the aggregation pipeline.

Note: use $ with field name to specify the field which has to be read.
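To illustrate setting and unsetting in one update, here is a minimal sketch that builds a two-stage pipeline: it copies one field into another and then removes the original. The helper name moveFieldPipeline and the field names oldName and newName are hypothetical; the $unset pipeline stage is available from MongoDB 4.2.

```javascript
// Build a pipeline that copies `from` into `to`, then drops `from`.
// Helper and field names are hypothetical.
function moveFieldPipeline(from, to) {
  return [
    { $set: { [to]: "$" + from } }, // "$from" reads the existing value
    { $unset: from },               // $unset stage removes the old field
  ];
}

const movePipeline = moveFieldPipeline("oldName", "newName");
console.log(JSON.stringify(movePipeline));

// In the shell against MongoDB 4.2+ (collection name assumed):
// db.collection_name.updateMany({}, movePipeline);
```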


Only works in MongoDB 4.2 and later. See: stackoverflow.com/a/37280419/404699
Thanks for pointing it out. The solution works for MongoDb 4.2+ versions
Chris Bloom

Here's what we came up with for copying one field to another for ~150_000 records. It took about 6 minutes, but is still significantly less resource intensive than it would have been to instantiate and iterate over the same number of ruby objects.

js_query = %({
  $or : [
    {
      'settings.mobile_notifications' : { $exists : false },
      'settings.mobile_admin_notifications' : { $exists : false }
    }
  ]
})

js_for_each = %(function(user) {
  if (!user.settings.hasOwnProperty('mobile_notifications')) {
    user.settings.mobile_notifications = user.settings.email_notifications;
  }
  if (!user.settings.hasOwnProperty('mobile_admin_notifications')) {
    user.settings.mobile_admin_notifications = user.settings.email_admin_notifications;
  }
  db.users.save(user);
})

js = "db.users.find(#{js_query}).forEach(#{js_for_each});"
Mongoid::Sessions.default.command('$eval' => js)

Yi Xiang Chong

With MongoDB 4.2+, updates are more flexible because update, updateOne, and updateMany accept an aggregation pipeline. You can now transform your documents using aggregation operators and then update them without explicitly stating the $set command (instead we use $replaceRoot: {newRoot: "$$ROOT"}).

Here we use the aggregate query to extract the timestamp from MongoDB's ObjectID "_id" field and update the documents (I am not an expert in SQL but I think SQL does not provide any auto generated ObjectID that has timestamp to it, you would have to automatically create that date)

var collection = "person"

agg_query = [
    {
        "$addFields" : {
            "_last_updated" : {
                "$toDate" : "$_id"
            }
        }
    },
    {
        $replaceRoot: {
            newRoot: "$$ROOT"
        } 
    }
]

db.getCollection(collection).updateMany({}, agg_query, {upsert: true})
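What $toDate reads from "_id" can be illustrated client-side: the first four bytes of an ObjectId encode its creation time as seconds since the Unix epoch. The sketch below decodes that from the 24-character hex form. The function name objectIdToDate and the sample id are made up for illustration.

```javascript
// Decode the creation time embedded in an ObjectId's 24-char hex string.
// The function name is hypothetical.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex ObjectId string");
  }
  // First 8 hex digits = 4 bytes = seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const sampleId = "5e0be100" + "0".repeat(16); // made-up id for illustration
console.log(objectIdToDate(sampleId).toISOString()); // 2020-01-01T00:00:00.000Z
```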

You don't need { $replaceRoot: { newRoot: "$$ROOT" } }; it means replacing the document by itself, which is pointless. If you replace $addFields by its alias $set and updateMany which is one of the aliases for update, then you get to the exact same answer as this one above.
Thanks @Xavier Guihot for the comment. The reason why $replaceRoot is used because we needed to use the $toDate query, while $set does not provide us a way to use $toDate
user1239961

(I would have posted this as a comment, but couldn't)

For anyone who lands here trying to update one field using another in the document with the C# driver... I could not figure out how to use any of the UpdateXXX methods and their associated overloads since they take an UpdateDefinition as an argument.

// we want to set Prop1 to Prop2
class Foo { public string Prop1 { get; set; } public string Prop2 { get; set;} } 

void Test()
{ 
     var update = new UpdateDefinitionBuilder<Foo>();
     update.Set(x => x.Prop1, <new value; no way to get a hold of the object that I can find>)
}

As a workaround, I found that you can use the RunCommand method on an IMongoDatabase (https://docs.mongodb.com/manual/reference/command/update/#dbcmd.update).

var command = new BsonDocument
        {
            { "update", "CollectionToUpdate" },
            { "updates", new BsonArray 
                 { 
                       new BsonDocument
                       {
                            // Any filter; here the check is if Prop1 does not exist
                            { "q", new BsonDocument{ ["Prop1"] = new BsonDocument("$exists", false) }}, 
                            // set it to the value of Prop2
                            { "u", new BsonArray { new BsonDocument { ["$set"] = new BsonDocument("Prop1", "$Prop2") }}},
                            { "multi", true }
                       }
                 }
            }
        };

 database.RunCommand<BsonDocument>(command);
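
For comparison, the same command document can be sketched in JavaScript, which makes the nesting of the BsonDocument above easier to see. The helper name buildUpdateCommand is hypothetical; the collection, filter, and field names are taken from the C# snippet.

```javascript
// Build the document for the `update` database command.
// `u` is an array, so the server treats it as an aggregation pipeline.
// The helper name is hypothetical.
function buildUpdateCommand(collection, filter, pipeline) {
  return {
    update: collection, // the command name must be the first key
    updates: [{ q: filter, u: pipeline, multi: true }],
  };
}

const command = buildUpdateCommand(
  "CollectionToUpdate",
  { Prop1: { $exists: false } },   // only documents without Prop1
  [{ $set: { Prop1: "$Prop2" } }]  // set Prop1 to the value of Prop2
);
console.log(JSON.stringify(command, null, 2));

// In the shell against MongoDB 4.2+:
// db.runCommand(command);
```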