When saving, how can you check if a field has changed?

In my model I have:

class Alias(MyBaseModel):
    remote_image = models.URLField(
        max_length=500, null=True,
        help_text='''
            A URL that is downloaded and cached for the image.
            Only used when the alias is made
        '''
    )
    image = models.ImageField(
        upload_to='alias', default='alias-default.png',
        help_text="An image representing the alias"
    )

    def save(self, *args, **kw):
        if (not self.image or self.image.name == 'alias-default.png') and self.remote_image:
            try:
                data = utils.fetch(self.remote_image)
                image = StringIO.StringIO(data)
                image = Image.open(image)
                buf = StringIO.StringIO()
                image.save(buf, format='PNG')
                self.image.save(
                    hashlib.md5(self.string_id).hexdigest() + ".png", ContentFile(buf.getvalue())
                )
            except IOError:
                pass

Which works great for the first time the remote_image changes.

How can I fetch a new image when someone has modified the remote_image on the alias? And secondly, is there a better way to cache a remote image?


Cesar Canassa

Essentially, you want to override the __init__ method of models.Model so that you keep a copy of the original value. This makes it so that you don't have to do another DB lookup (which is always a good thing).

    class Person(models.Model):
        name = models.CharField(max_length=100)  # example length

        __original_name = None

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # Remember the value the instance was created or loaded with.
            self.__original_name = self.name

        def save(self, force_insert=False, force_update=False, *args, **kwargs):
            if self.name != self.__original_name:
                # name changed - do something here
                pass

            super().save(force_insert, force_update, *args, **kwargs)
            # Reset the baseline so later saves compare against the saved value.
            self.__original_name = self.name

Instead of overriding __init__, I'd use the post_init signal: docs.djangoproject.com/en/dev/ref/signals/#post-init
Overriding methods is recommended by the Django documentation: docs.djangoproject.com/en/dev/topics/db/models/…
@callum So that if you make changes to the object, save it, then make additional changes and call save() on it AGAIN, it will still work correctly.
@Josh Won't there be a problem if you have several application servers working against the same database, since it only tracks changes in memory?
@lajarre, I think your comment is a bit misleading. The docs suggest that you take care when you do so. They don't recommend against it.
iperelivskiy

I use the following mixin:

from django.forms.models import model_to_dict


class ModelDiffMixin(object):
    """
    A model mixin that tracks model fields' values and provides a useful API
    for finding out which fields have changed.
    """

    def __init__(self, *args, **kwargs):
        super(ModelDiffMixin, self).__init__(*args, **kwargs)
        self.__initial = self._dict

    @property
    def diff(self):
        d1 = self.__initial
        d2 = self._dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if v != d2[k]]
        return dict(diffs)

    @property
    def has_changed(self):
        return bool(self.diff)

    @property
    def changed_fields(self):
        # list() so the result prints as a plain list on Python 3 as well
        return list(self.diff.keys())

    def get_field_diff(self, field_name):
        """
        Returns a diff for field if it's changed and None otherwise.
        """
        return self.diff.get(field_name, None)

    def save(self, *args, **kwargs):
        """
        Saves the model and resets the initial state.
        """
        super(ModelDiffMixin, self).save(*args, **kwargs)
        self.__initial = self._dict

    @property
    def _dict(self):
        return model_to_dict(self, fields=[field.name for field in
                             self._meta.fields])

Usage:

>>> p = Place()
>>> p.has_changed
False
>>> p.changed_fields
[]
>>> p.rank = 42
>>> p.has_changed
True
>>> p.changed_fields
['rank']
>>> p.diff
{'rank': (0, 42)}
>>> p.categories = [1, 3, 5]
>>> p.diff
{'categories': (None, [1, 3, 5]), 'rank': (0, 42)}
>>> p.get_field_diff('categories')
(None, [1, 3, 5])
>>> p.get_field_diff('rank')
(0, 42)
>>>

Note

Please note that this solution only works within the context of the current request, so it is suitable primarily for simple cases. In a concurrent environment, where multiple requests can manipulate the same model instance at the same time, you definitely need a different approach.
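
To make that limitation concrete, here is a plain-Python sketch with no Django involved. FAKE_DB, Record and load are hypothetical stand-ins for a table, a model instance and a query. The only point is that a snapshot taken at load time says nothing about what another request has written since.

```python
# Plain-Python stand-in for a database table: pk -> row dict.
FAKE_DB = {1: {"rank": 0}}


class Record:
    """Hypothetical model-like object that snapshots its fields when loaded."""

    def __init__(self, pk, **fields):
        self.pk = pk
        self.fields = dict(fields)
        self._initial = dict(fields)  # snapshot taken at load time

    @classmethod
    def load(cls, pk):
        return cls(pk, **FAKE_DB[pk])

    @property
    def diff(self):
        # Compares against the in-memory snapshot, not the stored row.
        return {
            name: (old, self.fields[name])
            for name, old in self._initial.items()
            if old != self.fields[name]
        }

    def save(self):
        FAKE_DB[self.pk] = dict(self.fields)
        self._initial = dict(self.fields)  # reset the snapshot after saving


# Two "requests" load the same row.
first = Record.load(1)
second = Record.load(1)

# The first request changes the value and saves it.
first.fields["rank"] = 42
first.save()

# The second request now assigns the value that is already stored.
second.fields["rank"] = 42
stale_diff = second.diff  # computed against the outdated snapshot
```

Here stale_diff still reports rank as changed from 0 to 42, although the stored row already holds 42 by the time the second request looks.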


Really perfect, and it doesn't perform an extra query. Thanks a lot!
Any advice on how to ignore a type change? It's considering this a difference: {'field_name': (0L, u'0')}
@IMFletcher In your case you are dealing with uncleaned data assigned to a model field. That is out of scope for this mixin. You could first clean the data with a model form, which would populate your model fields for free on saving, or do it manually, i.e. model_instance.field_name = model_form.cleaned_data['field_name']
Mixin is great, but this version has problems when used together with .only(). The call to Model.objects.only('id') will lead to infinite recursion if Model has at least 3 fields. To solve this, we should remove deferred fields from saving in initial and change _dict property a bit
Much like Josh's answer, this code will deceptively work fine on your single-process testing server, but the moment you deploy it to any sort of multi-processing server, it will give incorrect results. You can't know if you're changing the value in the database without querying the database.
radtek

The best way is with a pre_save signal. It may not have been an option back in '09 when this question was asked and answered, but anyone seeing this today should do it this way:

from django.db.models.signals import pre_save
from django.dispatch import receiver

@receiver(pre_save, sender=MyModel)
def do_something_if_changed(sender, instance, **kwargs):
    try:
        obj = sender.objects.get(pk=instance.pk)
    except sender.DoesNotExist:
        pass  # Object is new, so field hasn't technically changed, but you may want to do something else here.
    else:
        if obj.some_field != instance.some_field:  # Field has changed
            pass  # do something

Why is this the best way if the method that Josh describes above doesn't involve an extra database hit?
1) that method is a hack, signals are basically designed for uses like this 2) that method requires making alterations to your model, this one does not 3) as you can read in the comments on that answer, it has side-effects that can be potentially problematic, this solution does not
This way is great if you only care about catching the change just prior to saving. However, this won't work if you want to react to the change immediately. I have come across the latter scenario many times (and I'm working on one such instance now).
@Josh: What do you mean by "react to the change immediately"? In what way does this not let you "react"?
Sorry, I forgot the scope of this question and was referring to an entirely different problem. That said, I think signals are a good way to go here (now that they're available). However, I find many people consider overriding save a "hack." I don't believe this is the case. As this answer suggests (stackoverflow.com/questions/170337/…), I think overriding is the best practice when you're not working on changes that are "specific to the model in question." That said, I don't intend to impose that belief on anyone.
radtek

And now for the direct answer: one way to check whether a field's value has changed is to fetch the original data from the database before saving the instance. Consider this example:

class MyModel(models.Model):
    f1 = models.CharField(max_length=1)

    def save(self, *args, **kw):
        if self.pk is not None:
            orig = MyModel.objects.get(pk=self.pk)
            if orig.f1 != self.f1:
                print('f1 changed')
        super(MyModel, self).save(*args, **kw)

The same thing applies when working with a form. You can detect it at the clean or save method of a ModelForm:

class MyModelForm(forms.ModelForm):

    def clean(self):
        cleaned_data = super(MyModelForm, self).clean()
        #if self.has_changed():  # new instance or existing updated (form has data to save)
        if self.instance.pk is not None:  # existing instance only
            if self.instance.f1 != cleaned_data['f1']:
                print('f1 changed')
        return cleaned_data

    class Meta:
        model = MyModel
        exclude = []

Josh's solution is much more database friendly. An extra call to verify what's changed is expensive.
One extra read before you do a write isn't that expensive. Also the tracking changes method doesn't work if there are multiple requests. Although this would suffer from a race condition in between fetching and saving.
Stop telling people to check that pk is not None. It doesn't work, for example, with a UUIDField primary key whose default fills in pk before the first save. This is just bad advice.
@dalore you can avoid the race condition by decorating the save method with @transaction.atomic
@dalore although you'd need to make sure the transaction isolation level is sufficient. In postgresql, default is read committed, but repeatable read is necessary.
mcastle

Since Django 1.8, you can use the from_db classmethod to cache the old value of remote_image. Then, in the save method, you can compare the old and new values of the field to check whether it has changed.

@classmethod
def from_db(cls, db, field_names, values):
    new = super(Alias, cls).from_db(db, field_names, values)
    # cache the value loaded from the database
    # (assumes remote_image is not deferred, so it appears in field_names)
    new._loaded_remote_image = values[field_names.index('remote_image')]
    return new

def save(self, force_insert=False, force_update=False, using=None,
         update_fields=None):
    if (self._state.adding and self.remote_image) or \
        (not self._state.adding and self._loaded_remote_image != self.remote_image):
        # If it is the first save and a remote_image is set,
        # or the value of remote_image has changed - do your stuff!
        pass
    super(Alias, self).save(force_insert=force_insert, force_update=force_update,
                            using=using, update_fields=update_fields)

Thanks -- here's a reference to the docs: docs.djangoproject.com/en/1.8/ref/models/instances/…. I believe this still results in the aforementioned issue where the database may change between when this is evaluated and when the comparison is done, but this is a nice new option.
Rather than searching through values (which is O(n) based on number of values) wouldn't it be faster and clearer to do new._loaded_remote_image = new.remote_image ?
Unfortunately I have to reverse my previous (now deleted) comment. While from_db is called by refresh_from_db, the attributes on the instance (i.e. loaded or previous) are not updated. As a result, I can't find any reason why this is better than __init__ as you still need to handle 3 cases: __init__/from_db, refresh_from_db, and save.
Lee Hinde

Note that field change tracking is available in django-model-utils.

https://django-model-utils.readthedocs.org/en/latest/index.html


The FieldTracker from django-model-utils seems to work really well, thank you!
laffuste

If you are using a form, you can use Form's changed_data (docs):

class AliasForm(ModelForm):

    def save(self, commit=True):
        if 'remote_image' in self.changed_data:
            # do things
            remote_image = self.cleaned_data['remote_image']
            do_things(remote_image)
        # return the instance, as ModelForm.save() does
        return super(AliasForm, self).save(commit)

    class Meta:
        model = Alias
        fields = '__all__'

ramwin

I am a bit late to the party but I found this solution also: Django Dirty Fields


Looking at the tickets, it seems this package is not in a healthy condition right now (looking for maintainers, needing to change their CI by December 31st, etc.)
Aaron McMillin

Another late answer, but if you're just trying to see if a new file has been uploaded to a file field, try this: (adapted from Christopher Adams's comment on the link http://zmsmith.com/2010/05/django-check-if-a-field-has-changed/ in zach's comment here)

Updated link: https://web.archive.org/web/20130101010327/http://zmsmith.com:80/2010/05/django-check-if-a-field-has-changed/

def save(self, *args, **kw):
    from django.core.files.uploadedfile import UploadedFile
    if hasattr(self.image, 'file') and isinstance(self.image.file, UploadedFile):
        # Handle FileFields as special cases, because the uploaded filename could be
        # the same as the filename that's already there even though there may
        # be different file contents.

        # if a file was just uploaded, the underlying file will be an UploadedFile
        # Do new file stuff here
        pass
    super().save(*args, **kw)

That's an awesome solution for checking if a new file was uploaded. Much better than checking the name against the database because the name of the file could be the same. You can use it in a pre_save receiver, too. Thanks for sharing this!
Here's an example for updating audio duration in a database when the file was updated using mutagen for reading audio info - gist.github.com/DataGreed/1ba46ca7387950abba2ff53baf70fec2
Nimish Bansal

There is an attribute __dict__ which has all the fields as keys and the field values as values, so we can just compare two of them.

Just change the save function of the model to the function below:

def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
    if self.pk is not None:
        initial = A.objects.get(pk=self.pk)
        initial_json, final_json = initial.__dict__.copy(), self.__dict__.copy()
        initial_json.pop('_state')
        final_json.pop('_state')
        only_changed_fields = {
            k: {'final_value': final_json[k], 'initial_value': initial_json[k]}
            for k in initial_json if final_json[k] != initial_json[k]
        }
        print(only_changed_fields)
    # pass the caller's arguments through instead of resetting them to the defaults
    super(A, self).save(force_insert=force_insert, force_update=force_update,
                        using=using, update_fields=update_fields)

Example Usage:

class A(models.Model):
    name = models.CharField(max_length=200, null=True, blank=True)
    senior = models.CharField(choices=choices, max_length=3)
    timestamp = models.DateTimeField(null=True, blank=True)

    def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
        if self.pk is not None:
            initial = A.objects.get(pk=self.pk)
            initial_json, final_json = initial.__dict__.copy(), self.__dict__.copy()
            initial_json.pop('_state')
            final_json.pop('_state')
            only_changed_fields = {
                k: {'final_value': final_json[k], 'initial_value': initial_json[k]}
                for k in initial_json if final_json[k] != initial_json[k]
            }
            print(only_changed_fields)
        # pass the caller's arguments through instead of resetting them to the defaults
        super(A, self).save(force_insert=force_insert, force_update=force_update,
                            using=using, update_fields=update_fields)

yields output with only those fields that have been changed

{'name': {'initial_value': '1234515', 'final_value': 'nim'}, 'senior': {'initial_value': 'no', 'final_value': 'yes'}}

This works like a charm! You can also use it in pre_save signals where, if you need to make additional changes while updating the model itself, you can also make it race-condition safe as shown here.
Amichai Schreiber

As of Django 1.8, there's the from_db method, as Serge mentions. In fact, the Django docs include this specific use case as an example:

https://docs.djangoproject.com/en/dev/ref/models/instances/#customizing-model-loading

The docs introduce it as an example showing how to record the initial values of fields that are loaded from the database.
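
As a rough, framework-free illustration of that idea (this is not the docs' code), the sketch below uses a hypothetical Row class. Its from_db alternate constructor records the loaded values, and its adding flag plays the role that _state.adding plays in Django. All names here are assumptions made for the example.

```python
class Row:
    """Hypothetical stand-in for a model; not Django's implementation."""

    field_names = ("name", "remote_image")

    def __init__(self, **values):
        for field in self.field_names:
            setattr(self, field, values.get(field))
        self.adding = True        # True until the row has been stored
        self._loaded_values = {}  # filled in only by from_db()

    @classmethod
    def from_db(cls, field_names, values):
        # Alternate constructor used when a row comes back from storage.
        instance = cls(**dict(zip(field_names, values)))
        instance.adding = False
        instance._loaded_values = dict(zip(field_names, values))
        return instance

    def changed(self, field):
        # A brand-new row has no loaded value to compare against,
        # so any non-empty value counts as a change.
        if self.adding:
            return getattr(self, field) is not None
        return getattr(self, field) != self._loaded_values[field]


loaded = Row.from_db(("name", "remote_image"), ("logo", "http://example.com/a.png"))
unchanged_before_edit = loaded.changed("remote_image")
loaded.remote_image = "http://example.com/b.png"
changed_after_edit = loaded.changed("remote_image")
```

Because the loaded values are captured only in from_db, objects built directly with Row(...) are treated as new, which mirrors the two cases the save method has to distinguish.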


jhrs21

This works for me in Django 1.8

def clean(self):
    cleaned_data = super().clean()
    if cleaned_data['name'] != self.initial['name']:
        pass  # Do something
    return cleaned_data

Can you reference the documentation?
Robert Kajic

You can use django-model-changes to do this without an additional database lookup:

from django.db.models.signals import pre_save
from django.dispatch import receiver
from django_model_changes import ChangesMixin

class Alias(ChangesMixin, MyBaseModel):
    # your model
    pass

@receiver(pre_save, sender=Alias)
def do_something_if_changed(sender, instance, **kwargs):
    if 'remote_image' in instance.changes():
        pass  # do something

baqyoteto

Very late to the game, but this is a version of Chris Pratt's answer that protects against race conditions while sacrificing performance, by using a transaction block and select_for_update()

from django.db import transaction
from django.db.models.signals import pre_save
from django.dispatch import receiver

@receiver(pre_save, sender=MyModel)
@transaction.atomic
def do_something_if_changed(sender, instance, **kwargs):
    try:
        obj = sender.objects.select_for_update().get(pk=instance.pk)
    except sender.DoesNotExist:
        pass  # Object is new, so field hasn't technically changed, but you may want to do something else here.
    else:
        if obj.some_field != instance.some_field:  # Field has changed
            pass  # do something

Daniel Holmes

The optimal solution is probably one that requires neither an additional database read before saving the model instance nor any additional Django library. This is why laffuste's solution is preferable. In the context of an admin site, one can simply override the save_model method and invoke the form's has_changed method there, just as in Sion's answer above. You arrive at something like this, drawing on Sion's example setting but using changed_data to get every possible change:

class ModelAdmin(admin.ModelAdmin):
    fields = ['name', 'mode']

    def save_model(self, request, obj, form, change):
        form.changed_data  # output could be ['name']
        # do something with the changed name value...
        # call the super method
        super(ModelAdmin, self).save_model(request, obj, form, change)

Override save_model:

https://docs.djangoproject.com/en/1.10/ref/contrib/admin/#django.contrib.admin.ModelAdmin.save_model

Built-in changed_data attribute of a Form:

https://docs.djangoproject.com/en/1.10/ref/forms/api/#django.forms.Form.changed_data


SmileyChris

While this doesn't actually answer your question, I'd go about this in a different way.

Simply clear the remote_image field after successfully saving the local copy. Then in your save method you can always update the image whenever remote_image isn't empty.

If you'd like to keep a reference to the URL, you could use a non-editable boolean field to handle the caching flag rather than the remote_image field itself.
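
A minimal plain-Python sketch of that flow, assuming a hypothetical fetch callable that turns a URL into a local file name (or raises IOError). The Alias class below is only a stand-in for the model in the question, not Django code.

```python
class Alias:
    """Hypothetical stand-in for the model in the question (no Django)."""

    DEFAULT_IMAGE = "alias-default.png"

    def __init__(self, remote_image=None):
        self.remote_image = remote_image
        self.image = self.DEFAULT_IMAGE

    def save(self, fetch):
        # `fetch` is an assumed callable: URL -> local file name.
        if self.remote_image:
            try:
                self.image = fetch(self.remote_image)
            except IOError:
                pass  # keep the URL so a later save can retry
            else:
                self.remote_image = None  # cleared only after a successful fetch


calls = []


def fake_fetch(url):
    # Records each download so the example can show when fetching happens.
    calls.append(url)
    return "cached-%d.png" % len(calls)


alias = Alias(remote_image="http://example.com/a.png")
alias.save(fake_fetch)  # fetches and clears remote_image
alias.save(fake_fetch)  # nothing to do: remote_image is empty
alias.remote_image = "http://example.com/b.png"
alias.save(fake_fetch)  # a newly set URL triggers another fetch
```

With this design a non-empty remote_image is itself the "needs fetching" signal, so no comparison with a previous value is required.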


MYaser

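I had this situation before. My solution was to override the pre_save() method of the target field class. Django calls Field.pre_save() on every save, so the method itself has to check whether a new file was assigned.
This is useful with a FileField, for example:

class PDFField(FileField):
    def pre_save(self, model_instance, add):
        file = getattr(model_instance, self.attname)
        # _committed is a private FieldFile flag; a newly assigned file
        # has not been committed to storage yet
        if file and not file._committed:
            pass  # do some operations on your file
        return super().pre_save(model_instance, add)

Disadvantage: not useful if you want to do any post_save operation, such as using the created object in some job when a certain field has changed.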


3 revs

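I have extended @livskiy's mixin as follows:

from django.db import models
from django.db.models.fields.files import FieldFile
from django.forms.models import model_to_dict

# DictField is shown further down; in a real module it must be defined first.
class ModelDiffMixin(models.Model):
    """
    A model mixin that tracks model fields' values and provides a useful API
    for finding out which fields have changed.
    """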
    _dict = DictField(editable=False)
    def __init__(self, *args, **kwargs):
        super(ModelDiffMixin, self).__init__(*args, **kwargs)
        self._initial = self._dict

    @property
    def diff(self):
        d1 = self._initial
        d2 = self._dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if v != d2[k]]
        return dict(diffs)

    @property
    def has_changed(self):
        return bool(self.diff)

    @property
    def changed_fields(self):
        return self.diff.keys()

    def get_field_diff(self, field_name):
        """
        Returns a diff for field if it's changed and None otherwise.
        """
        return self.diff.get(field_name, None)

    def save(self, *args, **kwargs):
        """
        Saves model and set initial state.
        """
        object_dict = model_to_dict(self,
               fields=[field.name for field in self._meta.fields])
        for field in object_dict:
            # for FileFields
            if issubclass(object_dict[field].__class__, FieldFile):
                try:
                    object_dict[field] = object_dict[field].path
                except Exception:  # no local path available, fall back to the name
                    object_dict[field] = object_dict[field].name

            # TODO: add other non-serializable field types
        self._dict = object_dict
        super(ModelDiffMixin, self).save(*args, **kwargs)

    class Meta:
        abstract = True

and the DictField is:

import json

class DictField(models.TextField):
    description = "Stores a python dict"

    def from_db_value(self, value, expression, connection):
        # Replaces models.SubfieldBase, which was removed in Django 1.10
        # (signature as of Django 2.0).
        return self.to_python(value)

    def to_python(self, value):
        if not value:
            value = {}

        if isinstance(value, dict):
            return value

        return json.loads(value)

    def get_prep_value(self, value):
        if value is None:
            return value
        return json.dumps(value)

    def value_to_string(self, obj):
        value = self.value_from_object(obj)
        return self.get_prep_value(value)

To use it, extend the mixin in your models. A _dict field will be added when you sync/migrate, and that field will store the state of your objects.


Hassek

Improving on @josh's answer so that it covers all fields:

class Person(models.Model):
    name = models.CharField(max_length=100)  # example length

    def __init__(self, *args, **kwargs):
        super(Person, self).__init__(*args, **kwargs)
        self._original_fields = dict([(field.attname, getattr(self, field.attname))
            for field in self._meta.local_fields if not isinstance(field, models.ForeignKey)])

    def save(self, *args, **kwargs):
        if self.id:
            for field in self._meta.local_fields:
                if not isinstance(field, models.ForeignKey) and \
                        self._original_fields[field.attname] != getattr(self, field.attname):
                    pass  # Do something
        super(Person, self).save(*args, **kwargs)

Just to clarify, getattr lets you read fields like person.name using a string, i.e. getattr(person, "name").
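
The getattr-based comparison can be tried outside Django too. The following is a small sketch with a plain class standing in for a model instance; snapshot and changed_fields are hypothetical helper names, not part of any library.

```python
def snapshot(obj, field_names):
    """Copy the named attributes into a dict using getattr."""
    return {name: getattr(obj, name) for name in field_names}


def changed_fields(obj, original):
    """Return the names whose current value differs from the snapshot."""
    return [name for name, old in original.items() if getattr(obj, name) != old]


class Person:
    # Plain class standing in for a model instance.
    def __init__(self, name, city):
        self.name = name
        self.city = city


person = Person("Ada", "London")
original = snapshot(person, ["name", "city"])
person.name = "Grace"
result = changed_fields(person, original)
```

After the assignment, result contains only "name", since city still matches the snapshot.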


And it is still not making extra db queries?
I was trying to implement your code. It works fine when editing fields, but now I have a problem with inserting new objects: I get DoesNotExist for the FK field in my class. Any hint on how to solve it would be appreciated.
I have just updated the code, it now skips the foreign keys so you don't need to fetch those files with extra queries (very expensive) and if the object doesn't exist it will skip the extra logic.
A. Kali

My take on @iperelivskiy's solution: at large scale, creating the _initial dict on every __init__ is expensive and, most of the time, unnecessary. I have changed the mixin slightly so that it records changes only when you explicitly tell it to (by calling instance.track_changes()):

from typing import KeysView, Optional
from django.forms import model_to_dict

class TrackChangesMixin:
    _snapshot: Optional[dict] = None

    def track_changes(self):
        self._snapshot = self.as_dict

    @property
    def diff(self) -> dict:
        if self._snapshot is None:
            raise ValueError("track_changes wasn't called, can't determine diff.")
        d1 = self._snapshot
        d2 = self.as_dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if str(v) != str(d2[k])]
        return dict(diffs)

    @property
    def has_changed(self) -> bool:
        return bool(self.diff)

    @property
    def changed_fields(self) -> KeysView:
        return self.diff.keys()

    @property
    def as_dict(self) -> dict:
        return model_to_dict(self, fields=[field.name for field in self._meta.fields])

I've had a long-term issue with Django raising recursion errors (specifically RecursionError: Maximum Recursion Depth Exceeded) when trying to delete some objects, and I wasn't able to figure it out. It turns out it was ModelDiffMixin. I replaced it with your version and now it works. So happy! Thanks.
icarus

I have found this package: django-lifecycle. It provides a @hook decorator for reacting to model changes, which its documentation presents as an alternative to Django signals, and it is very robust and reliable. I have used it and it is a bliss.


While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
Sion

How about using David Cramer's solution:

http://cramer.io/2010/12/06/tracking-changes-to-fields-in-django/

I've had success using it like this:

@track_data('name')
class Mode(models.Model):
    name = models.CharField(max_length=5)
    mode = models.CharField(max_length=5)

    def save(self, *args, **kwargs):
        if self.has_changed('name'):
            print('name changed')
        # without this call the model is never actually saved
        super(Mode, self).save(*args, **kwargs)

    # OR #

    @classmethod
    def post_save(cls, sender, instance, created, **kwargs):
        if instance.has_changed('name'):
            print("Hooray!")

If you forget super(Mode, self).save(*args, **kwargs) then you're disabling the save function so remember to put this in the save method.
The link of the article is outdated, this is the new link: cra.mr/2010/12/06/tracking-changes-to-fields-in-django
This answer is unfortunately incomplete. What is @track_data? The link no longer explains anything about this but redirects to the front page instead, which is why it's not good to depend on content in links being permanent.
theicfire

A modification to @ivanperelivskiy's answer:

# requires: from django.db.models import ForeignObjectRel
@property
def _dict(self):
    ret = {}
    for field in self._meta.get_fields():
        if isinstance(field, ForeignObjectRel):
            # foreign objects might not have corresponding objects in the database.
            if hasattr(self, field.get_accessor_name()):
                ret[field.get_accessor_name()] = getattr(self, field.get_accessor_name())
            else:
                ret[field.get_accessor_name()] = None
        else:
            ret[field.attname] = getattr(self, field.attname)
    return ret

This uses django 1.10's public method get_fields instead. This makes the code more future proof, but more importantly also includes foreign keys and fields where editable=False.

For reference, here is the implementation of .fields

@cached_property
def fields(self):
    """
    Returns a list of all forward fields on the model and its parents,
    excluding ManyToManyFields.

    Private API intended only to be used by Django itself; get_fields()
    combined with filtering of field properties is the public API for
    obtaining this field list.
    """
    # For legacy reasons, the fields property should only contain forward
    # fields that are not private or with a m2m cardinality. Therefore we
    # pass these three filters as filters to the generator.
    # The third lambda is a longwinded way of checking f.related_model - we don't
    # use that property directly because related_model is a cached property,
    # and all the models may not have been loaded yet; we don't want to cache
    # the string reference to the related_model.
    def is_not_an_m2m_field(f):
        return not (f.is_relation and f.many_to_many)

    def is_not_a_generic_relation(f):
        return not (f.is_relation and f.one_to_many)

    def is_not_a_generic_foreign_key(f):
        return not (
            f.is_relation and f.many_to_one and not (hasattr(f.remote_field, 'model') and f.remote_field.model)
        )

    return make_immutable_fields_list(
        "fields",
        (f for f in self._get_fields(reverse=False)
         if is_not_an_m2m_field(f) and is_not_a_generic_relation(f) and is_not_a_generic_foreign_key(f))
    )

Antwane

Here is another way of doing it.

class Parameter(models.Model):

    def __init__(self, *args, **kwargs):
        super(Parameter, self).__init__(*args, **kwargs)
        self.__original_value = self.value

    def clean(self,*args,**kwargs):
        if self.__original_value == self.value:
            print("unchanged")
        else:
            print("changed")

    def save(self,*args,**kwargs):
        self.full_clean()
        super(Parameter, self).save(*args, **kwargs)
        # update the baseline only after the save has gone through
        self.__original_value = self.value

    key = models.CharField(max_length=24, db_index=True, unique=True)
    value = models.CharField(max_length=128)

As per documentation: validating objects

"The second step full_clean() performs is to call Model.clean(). This method should be overridden to perform custom validation on your model. This method should be used to provide custom model validation, and to modify attributes on your model if desired. For instance, you could use it to automatically provide a value for a field, or to do validation that requires access to more than a single field:"


Jiaaro

As an extension of SmileyChris' answer, you can add a datetime field named last_updated to the model and set a limit on the maximum age you'll let it reach before checking for a change.
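
A minimal sketch of such an age check in plain Python. The 24-hour MAX_AGE and the needs_refresh helper are assumptions made for illustration. In a model, last_updated would be the datetime field mentioned above.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=24)  # assumed limit; tune to taste


def needs_refresh(last_updated, now, max_age=MAX_AGE):
    """Return True if the image was never fetched or is older than max_age."""
    if last_updated is None:
        return True
    return now - last_updated > max_age


now = datetime(2020, 1, 2, 12, 0)
fresh = needs_refresh(datetime(2020, 1, 2, 0, 0), now)  # 12 hours old
stale = needs_refresh(datetime(2020, 1, 1, 0, 0), now)  # 36 hours old
```

Passing now in explicitly keeps the check easy to test. In save() you would pass the current time and re-fetch the remote image only when the helper returns True.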


sknutsonsf

The mixin from @ivanlivski is great.

I've extended it to ensure it works with Decimal fields and to expose properties that simplify usage.

The updated code is available here: https://github.com/sknutsonsf/python-contrib/blob/master/src/django/utils/ModelDiffMixin.py

To help people new to Python or Django, I'll give a more complete example. This particular usage is to take a file from a data provider and ensure the records in the database reflect the file.

My model object:

class Station(ModelDiffMixin.ModelDiffMixin, models.Model):
    station_name = models.CharField(max_length=200)
    nearby_city = models.CharField(max_length=200)

    precipitation = models.DecimalField(max_digits=5, decimal_places=2)
    # <list of many other fields>

    def is_float_changed(self, v1, v2):
        ''' Compare two floating values to just two digit precision
        Override Default precision is 5 digits
        '''
        return abs(round(v1 - v2, 2)) > 0.01

The class that loads the file has these methods:

class UpdateWeather(object):
    # other methods omitted

    def update_stations (self, filename):
        # read all existing data 
        all_stations = models.Station.objects.all()
        self._existing_stations = {}

        # insert into a collection for referencing while we check if data exists
        for stn in all_stations.iterator():
            self._existing_stations[stn.id] = stn

        # read the file. result is array of objects in known column order
        # (assumed here to be a sheet-like object exposing nrows and row())
        sh = read_tabbed_file(filename)

        # iterate rows from file and insert or update where needed
        for rownum in range(sh.nrows):
            self._update_row(sh.row(rownum))

        # now anything remaining in the collection is no longer active
        # since it was not found in the newest file
        # for now, delete that record
        # there should never be any of these if the file was created properly
        for stn in self._existing_stations.values():
            stn.delete()
            self._num_deleted = self._num_deleted+1


    def _update_row (self, rowdata):
        stnid = int(rowdata[0].value) 
        name = rowdata[1].value.strip()

        # skip the blank names where data source has ids with no data today
        if len(name) < 1:
            return

        # fetch rest of fields and do sanity test
        nearby_city = rowdata[2].value.strip()
        precip = rowdata[3].value

        if stnid in self._existing_stations:
            stn = self._existing_stations[stnid]
            del self._existing_stations[stnid]
            is_update = True
        else:
            stn = models.Station()
            is_update = False

        # object is new or old, don't care here
        stn.id = stnid
        stn.station_name = name
        stn.nearby_city = nearby_city
        stn.precipitation = precip

        # many other fields updated from the file

        if is_update:

            # we use a model mixin to simplify detection of changes
            # at the cost of extra memory to store the objects
            if stn.has_changed:
                self._num_updated = self._num_updated + 1
                stn.save()
        else:
            self._num_created = self._num_created + 1
            stn.save()

theTypan

If you'd rather not override the save method, you can do this:

  model_fields = [f.name for f in YourModel._meta.get_fields()]
  valid_data = {
      key: new_data[key]
      for key in model_fields
      if key in new_data
  }

  for key, value in valid_data.items():
      if getattr(instance, key) != value:
          print('Data has changed')

      setattr(instance, key, value)

  instance.save()

Milo Persic

Sometimes I want to check for changes on the same specific fields on multiple models that share those fields, so I define a list of those fields and use a signal. In this case, geocoding addresses only if something has changed, or if the entry is new:

from django.db.models.signals import pre_save
from django.dispatch import receiver

@receiver(pre_save, sender=SomeUserProfileModel)
@receiver(pre_save, sender=SomePlaceModel)
@receiver(pre_save, sender=SomeOrganizationModel)
@receiver(pre_save, sender=SomeContactInfoModel)
def geocode_address(sender, instance, *args, **kwargs):

    input_fields = ['address_line', 'address_line_2', 'city', 'state', 'postal_code', 'country']

    try:
        orig = sender.objects.get(id=instance.id)
    except sender.DoesNotExist:
        # there is no original (the entry is new), so geocode unconditionally
        my_geocoder_function(instance)
    else:
        # geocode only if at least one of the address fields changed
        if any(getattr(instance, field) != getattr(orig, field) for field in input_fields):
            my_geocoder_function(instance)

Writing it once and attaching with "@receiver" sure beats overriding multiple model save methods, but perhaps some others have better ideas.