Can I use this library to work with Django's native bulk_update to update the modified field?
simkimsia opened this issue · comments
Related to your comment about updating modified only when values actually changed: #3 (comment)

In that issue, I gave an example of updating many records with the same set of values. In this issue, I am talking about updating many records where each record may have a different set of new values, which we won't know beforehand.
My code currently looks like this:

```python
changes = [
    {
        'id': 1,
        'field1': 'new_value11',
        'field2': 'new_value12',
    },
    {
        'id': 2,
        'field1': 'new_value21',
        'field2': 'new_value22',
    },
]

to_be_updated = []
to_be_affected = Book.objects.filter(**some_filter_params)
# ... some function that loops through `to_be_affected` and `changes`; when the
# ids match, the instance's fields adopt the values in `changes` and the
# instance is appended to `to_be_updated`
Book.objects.bulk_update(to_be_updated, ['field1', 'field2'])  # the native bulk_update
```
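For completeness, that merge step could look something like this (a minimal sketch; the plain `Book` class below stands in for the model instances, and `apply_changes` is a hypothetical helper name, not part of any library):

```python
class Book:
    """Stand-in for a model instance, for illustration only."""
    def __init__(self, id, field1=None, field2=None):
        self.id, self.field1, self.field2 = id, field1, field2

def apply_changes(to_be_affected, changes, fields=('field1', 'field2')):
    """Match instances to `changes` by id and copy the new values over."""
    by_id = {change['id']: change for change in changes}
    to_be_updated = []
    for obj in to_be_affected:
        change = by_id.get(obj.id)
        if change is not None:
            for field in fields:
                setattr(obj, field, change[field])
            to_be_updated.append(obj)
    return to_be_updated
```

The returned list is what gets passed to `bulk_update`.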
This works well, so far so good. But my modified field is not updated.

What if I stick with bulk_update?

If I continue to use `bulk_update`, I can add the modified field as well and set it to `datetime.now()` for every row. But then modified changes even when the values are actually unchanged. I could of course write more code to do the checks myself, but I think that's not elegant.
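For reference, the manual check would look something like this (a sketch using plain dicts in place of model instances; `rows_that_changed` and `current` are illustrative names, not library APIs):

```python
def rows_that_changed(current, changes, fields=('field1', 'field2')):
    """Return the subset of `changes` whose values differ from `current`.

    `current` maps id -> {field: value} as stored in the database;
    `changes` is the list of {'id': ..., 'field1': ..., ...} dicts.
    """
    changed = []
    for change in changes:
        stored = current.get(change['id'], {})
        if any(stored.get(field) != change[field] for field in fields):
            changed.append(change)
    return changed

current = {
    1: {'field1': 'new_value11', 'field2': 'new_value12'},  # already up to date
    2: {'field1': 'old_value21', 'field2': 'old_value22'},  # differs
}
changes = [
    {'id': 1, 'field1': 'new_value11', 'field2': 'new_value12'},
    {'id': 2, 'field1': 'new_value21', 'field2': 'new_value22'},
]
print(rows_that_changed(current, changes))  # only id 2 needs modified bumped
```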
Can I use model-values in some way?

Because of your previous comment, I realize it might be possible to do bulk changes but selectively update the modified field depending on whether the values themselves changed. I also don't mind dropping `bulk_update` if that helps.
There are a few different options here.
bulk_changed will find which rows have changed efficiently, but it only handles one field at a time. There could be a multi-field variant, but it would be an ugly interface.
```python
for field in ('field1', 'field2'):
    data = {row['id']: row[field] for row in changes}
    diff = qs.bulk_changed(field, data)
    # bulk update keys from diff
```
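In pure-Python terms, what `bulk_changed` identifies is the set of pks whose proposed value differs from the stored one; conceptually something like this (a sketch of the idea, not the library's implementation or exact return type):

```python
def changed_pks(stored, data):
    """Pks in `data` whose proposed value differs from the stored one."""
    return {pk for pk, value in data.items() if stored.get(pk) != value}

stored = {1: 'new_value11', 2: 'old_value21'}  # current db values for field1
data = {1: 'new_value11', 2: 'new_value21'}    # proposed values
print(changed_pks(stored, data))  # {2}
```

Only those keys then need the update, which is what lets modified be bumped selectively.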
`update` is extended to translate a dict into a case statement. So it's similar to the builtin `bulk_update`, but naturally with a data-oriented interface. By itself it would over-count the rows, but it works nicely with `bulk_changed`.
Finally there's `bulk_change`, which efficiently updates and returns the modified counts. It also works one field at a time, but more importantly it works by inverting the data to use `pk__in` queries. This is much more efficient if the number of unique values is low, like an enum or bool.
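The inversion can be sketched in plain Python (`invert` is an illustrative name, not the library's internals):

```python
from collections import defaultdict

def invert(data):
    """Group pks by proposed value, so each unique value becomes a single
    UPDATE ... WHERE pk IN (...) query instead of one query per row."""
    groups = defaultdict(list)
    for pk, value in data.items():
        groups[value].append(pk)
    return dict(groups)

data = {1: 'in_print', 2: 'out_of_print', 3: 'in_print'}
print(invert(data))  # {'in_print': [1, 3], 'out_of_print': [2]}
```

With only two unique values here, three rows collapse into two queries.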
I would start with `bulk_changed` and go from there.
Thank you!