ChatGPT解决这个技术问题 Extra ChatGPT

How to 'bulk update' with Django?

I'd like to update a table with Django - something like this in raw SQL:

update tbl_name set name = 'foo' where name = 'bar'

My first result is something like this - but that's nasty, isn't it?

list = ModelClass.objects.filter(name = 'bar')
for obj in list:
    obj.name = 'foo'
    obj.save()

Is there a more elegant way?

You might be looking for batch insert. Take a look at stackoverflow.com/questions/4294088/…
I don't like to insert new data - just update existing.
Maybe with help of select_for_update? docs.djangoproject.com/en/dev/ref/models/querysets/…
What is not nasty about the ModelClass approach? Then feed to Django as: stackoverflow.com/questions/16853649/…

F
Flimm

Update:

Django 2.2 version now has a bulk_update.

Old answer:

Refer to the following django documentation section

Updating multiple objects at once

In short you should be able to use:

ModelClass.objects.filter(name='bar').update(name="foo")

You can also use F objects to do things like incrementing rows:

from django.db.models import F
Entry.objects.all().update(n_pingbacks=F('n_pingbacks') + 1)

See the documentation.

However, note that:

This won't use ModelClass.save method (so if you have some logic inside it won't be triggered).

No django signals will be emitted.

You can't perform an .update() on a sliced QuerySet, it must be on an original QuerySet so you'll need to lean on the .filter() and .exclude() methods.


Note also that as a consequence of not using save(), DateTimeField fields with auto_now=True ("modified" columns) won't be updated.
But ModelClass.objects.filter(name = 'bar').update(name="foo") does not fulfill purpose of bulk update, if i have different data for different ids the how could i do that without using loop ?
@shihon I'm not sure if I got you right but I added example to the answer.
@Shashank have you found any solution for your case yet ? I am also having the same scenario .
F objects cannot be used to reference different models in the .update method... for example you can't use Entry.objects.all().update(title=F('blog__title')). Docs have a small mention of this. If you want to pull data from another model to update your entries, you'll have to run a for loop
n
nu everest

Consider using django-bulk-update found here on GitHub.

Install: pip install django-bulk-update

Implement: (code taken directly from projects ReadMe file)

from bulk_update.helper import bulk_update

random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()

for person in people:
    r = random.randrange(4)
    person.name = random_names[r]

bulk_update(people)  # updates all columns using the default db

Update: As Marc points out in the comments this is not suitable for updating thousands of rows at once. Though it is suitable for smaller batches 10's to 100's. The size of the batch that is right for you depends on your CPU and query complexity. This tool is more like a wheel barrow than a dump truck.


I tried django-bulk-update, and I personally discourage using it. What it does internally is to create a single SQL statement which looks like this: UPDATE "table" SET "field" = CASE "id" WHEN %s THEN %s WHEN %s THEN %s [...] WHERE id in (%s, %s, [...]);. This is kind of all right for few rows (when bulk updater is not needed), but with 10,000, the query is so complex, that postgres spends more time with CPU at 100% understanding the query, than the time it saves writing to disk.
@MarcGarcia good point. I found many developers use external libraries without knowing their impact
@MarcGarcia I disagree that bulk update is not valuable and only really needed when thousands of updates are necessary. Using it to do 10,000 rows at once is not advisable for the reasons you mentioned, but using it to update 50 rows at once is much more efficient than hitting the db with 50 separate update requests.
The best solutions I found are: a) use @transaction.atomic decorator, which improves performance by using a single transaction, or b) make a bulk insert in a temporary table, and then an UPDATE from the temporary table to the original one.
I know this is an old thread, but actually CASE/WHERE is not the only way. For PostgreSQL there are other approaches, but they are DB specific e.g. stackoverflow.com/a/18799497 However I am not sure if this is possible in ANSI SQL
T
Tim Tisdall

Django 2.2 version now has a bulk_update method (release notes).

https://docs.djangoproject.com/en/stable/ref/models/querysets/#bulk-update

Example:

# get a pk: record dictionary of existing records
updates = YourModel.objects.filter(...).in_bulk()
....
# do something with the updates dict
....
if hasattr(YourModel.objects, 'bulk_update') and updates:
    # Use the new method
    YourModel.objects.bulk_update(updates.values(), [list the fields to update], batch_size=100)
else:
    # The old & slow way
    with transaction.atomic():
        for obj in updates.values():
            obj.save(update_fields=[list the fields to update])

A
Andreas Bergström

If you want to set the same value on a collection of rows, you can use the update() method combined with any query term to update all rows in one query:

some_list = ModelClass.objects.filter(some condition).values('id')
ModelClass.objects.filter(pk__in=some_list).update(foo=bar)

If you want to update a collection of rows with different values depending on some condition, you can in best case batch the updates according to values. Let's say you have 1000 rows where you want to set a column to one of X values, then you could prepare the batches beforehand and then only run X update-queries (each essentially having the form of the first example above) + the initial SELECT-query.

If every row requires a unique value there is no way to avoid one query per update. Perhaps look into other architectures like CQRS/Event sourcing if you need performance in this latter case.


J
Jasir

Here is a useful content which i found in internet regarding the above question

https://www.sankalpjonna.com/learn-django/running-a-bulk-update-with-django

The inefficient way

model_qs= ModelClass.objects.filter(name = 'bar')
for obj in model_qs:
    obj.name = 'foo'
    obj.save()

The efficient way

ModelClass.objects.filter(name = 'bar').update(name="foo") # for single value 'foo' or add loop

Using bulk_update

update_list = []
model_qs= ModelClass.objects.filter(name = 'bar')
for model_obj in model_qs:
    model_obj.name = "foo" # Or what ever the value is for simplicty im providing foo only
    update_list.append(model_obj)
    
ModelClass.objects.bulk_update(update_list,['name'])

Using an atomic transaction

from django.db import transaction

with transaction.atomic():
    model_qs = ModelClass.objects.filter(name = 'bar')
    for obj in model_qs:
       ModelClass.objects.filter(name = 'bar').update(name="foo")

Any Up Votes ? Thanks in advance : Thank you for keep an attention ;)


Your bulk_update example makes no sense to me - why are you re-pulling the model_obj from the database? Same with the article you refer to - its bulk_update example doesnt need any db select at all, it just can do User.objects.bulk_update([User(id=k, score=v) for k, v in user_ids_dict.items()], ['score'])...
k
kathir raja

To update with same value we can simply use this

ModelClass.objects.filter(name = 'bar').update(name='foo')

To update with different values

list = ModelClass.objects.filter(name = 'bar')
obj_to_be_update = []
for obj in list:
    obj.name = "Dear "+obj.name
    obj_to_be_update.append(obj)
ModelClass.objects.bulk_update(obj_update_list, ['name'], batch_size=1000)

It won't trigger save signal every time instead we keep all the objects to be updated on the list and trigger update signal at once.


Your second case should still work with update, something like that: ModelClass.objects.filter(name='bar').update(name = 'Dear ' + F('name'))
yep, I agree but I mentioned loop here to add some more usabilities, not only concatenating strings, we can even do stuffs like computing, read data from other sources, do if-else inside loop and more
s
shivam sharma

IT returns number of objects are updated in table.

update_counts = ModelClass.objects.filter(name='bar').update(name="foo")

You can refer this link to get more information on bulk update and create. Bulk update and Create