alphagov / asset-manager

Manages uploaded assets (images, PDFs etc.) for applications on GOV.UK

Home Page: https://docs.publishing.service.gov.uk/apps/asset-manager.html

Investigate replaced Whitehall attachment on discarded edition

floehopper opened this issue

When trying to check that the Whitehall attachment metadata had been correctly set in Asset Manager, @chrislo and I came across the following example:

A request to the following URL serves the attachment from Whitehall (i.e. the underlying file system) with a 200 OK response:

https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/251493/Audit_committees_consultation.pdf

A request to the following URL (after signing in) does a 301 Moved Permanently redirect to a replacement attachment served from Asset Manager:

https://draft-assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/310014/DH_Consultation_on_NHS_Trust.pdf

Initial investigation suggests that the following has happened in Whitehall:

  1. Draft edition created
  2. Attachment added
  3. Edition published
  4. New draft edition created
  5. Attachment replaced
  6. Draft edition discarded

Whitehall is (correctly) serving the original attachment rather than redirecting because the replacement AttachmentData is associated with an edition in the deleted state, so the call to BaseAttachmentsController#attachment_visible? from AttachmentsController#show (correctly) returns true.

The reason Asset Manager is (incorrectly) redirecting to the replacement attachment is that the Asset Manager asset has a replacement and we are making the request via the draft-assets host, i.e. this condition in WhitehallMediaController#download is true.
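The difference between the two behaviours can be sketched in plain Ruby. This is a simplified, hypothetical model and not the real Whitehall or Asset Manager code: `SketchEdition`, `SketchAsset`, `asset_manager_response` and `whitehall_response` are names invented for illustration, and the conditions are only assumed from the description above.

```ruby
# Hypothetical, simplified model of the two behaviours described above.
SketchEdition = Struct.new(:state) do
  def deleted?
    state == "deleted"
  end
end

SketchAsset = Struct.new(:name, :replacement, keyword_init: true)

# Assumed shape of the Asset Manager check: redirect whenever the asset has a
# replacement and the request arrived via the draft-assets host.
def asset_manager_response(asset, draft_host:)
  if asset.replacement && draft_host
    [301, asset.replacement.name]
  else
    [200, asset.name]
  end
end

# Assumed shape of the Whitehall outcome: a replacement that belongs to a
# deleted edition is ignored, so the original file is served.
def whitehall_response(asset, replacement_edition)
  if asset.replacement && !replacement_edition.deleted?
    [301, asset.replacement.name]
  else
    [200, asset.name]
  end
end

replacement = SketchAsset.new(name: "replacement.pdf", replacement: nil)
original = SketchAsset.new(name: "original.pdf", replacement: replacement)

asset_manager_response(original, draft_host: true)            # => [301, "replacement.pdf"]
whitehall_response(original, SketchEdition.new("deleted"))    # => [200, "original.pdf"]
```

In this model the Asset Manager check never looks at the state of the edition that owns the replacement, which is one way the two systems could end up disagreeing.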

One thing that confuses me is that I thought that discarding a draft edition resulted in all its attachments being marked as deleted. However, the Attachment associated with the replacement AttachmentData does not appear to be marked as deleted. I don't really know how this can have happened. However, it does at least explain why the replacement attachment asset is not marked as deleted in Asset Manager.

Even if the replacement attachment asset were marked as deleted, I think Asset Manager might still attempt the redirect, but it would probably raise an exception, because asset.replacement would return nil in BaseMediaController#redirect_to_replacement_for.
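To illustrate the suspected failure mode, here is a minimal sketch, assuming the redirect helper calls a method on `asset.replacement` without checking for nil. `ReplaceableAsset`, `redirect_target_for` and `safe_redirect_target_for` are hypothetical names; the real BaseMediaController code may differ.

```ruby
# Hypothetical model: an asset whose replacement lookup has come back as nil,
# e.g. because the replacement was marked as deleted.
ReplaceableAsset = Struct.new(:public_url_path, :replacement, keyword_init: true)

# Assumed shape of an unguarded helper: it dereferences the replacement directly.
def redirect_target_for(asset)
  asset.replacement.public_url_path
end

# Guarded variant using the safe navigation operator; returns nil when there
# is no replacement, leaving the caller to decide what to serve instead.
def safe_redirect_target_for(asset)
  asset.replacement&.public_url_path
end

orphan = ReplaceableAsset.new(public_url_path: "/original.pdf", replacement: nil)

begin
  redirect_target_for(orphan)
rescue NoMethodError => e
  puts "unguarded helper raised: #{e.class}"
end

safe_redirect_target_for(orphan) # => nil
```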

Note that I had a go at extending the relevant integration test to include this scenario in this branch, although I'm not sure it's quite right.

I think there might be an issue with AttachmentData#deleted?, because it looks as if it only takes into account whether the edition has been deleted; I think it should also take into account whether the Attachment is marked as deleted, i.e. when a single attachment is deleted and the edition is not deleted.

I don't think the above statement about AttachmentData#deleted? is true after all. I think I must've been confused by the mention of "attachable" in the include_deleted_attachables option passed to AttachmentData#significant_attachment.

In fact, as the name clearly suggests, the latter returns an instance of an Attachment and thus AttachmentData#deleted? does take account of whether the attachment has been marked as deleted.
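As a rough sketch of that reading, the model below shows how a deleted? check that goes through an Attachment picks up the attachment's own deleted flag. It is a hypothetical simplification: `SketchAttachment` and `SketchAttachmentData` are invented names, and the selection logic inside `significant_attachment` is an assumption rather than Whitehall's actual implementation.

```ruby
# Hypothetical model of an attachment with its own deleted flag, plus a flag
# recording whether its attachable (the edition) has been deleted.
SketchAttachment = Struct.new(:deleted, :attachable_deleted, keyword_init: true) do
  def deleted?
    deleted
  end
end

class SketchAttachmentData
  def initialize(attachments)
    @attachments = attachments
  end

  # Returns an attachment (not an attachable). The option only controls
  # whether attachments on deleted attachables are considered at all.
  def significant_attachment(include_deleted_attachables: false)
    candidates = @attachments
    candidates = candidates.reject(&:attachable_deleted) unless include_deleted_attachables
    candidates.last
  end

  # Because the result is an attachment, this reads the attachment's own flag.
  def deleted?
    significant_attachment(include_deleted_attachables: true).deleted?
  end
end

# A single attachment deleted on an edition that is still live.
SketchAttachmentData.new(
  [SketchAttachment.new(deleted: true, attachable_deleted: false)]
).deleted? # => true

# The odd case from this issue: the edition is deleted, the attachment is not.
SketchAttachmentData.new(
  [SketchAttachment.new(deleted: false, attachable_deleted: true)]
).deleted? # => false
```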

I thought this was worth clarifying, but note that this doesn't resolve the original problem as described in the issue description.

I've now gone through this myself.

I've followed the steps above on integration and could not reproduce the issue. Everything in Whitehall's DB ended up as you would expect.

I do agree that the attachments listed above ended up in an odd non-deleted state while the edition itself is marked as deleted. That appears impossible given the code linked above, which deletes all attachments when the edition is deleted.

However, the code for this asset deletion was added on 23 March 2016, while the edition in question was discarded on 9 May 2014. The commit that added the code calls this situation out specifically, though it leaves out the Asset Manager issue, presumably because Asset Manager was not yet in use by Whitehall at the time.
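The timeline argument can be expressed as a small check. This is only a sketch: `DiscardedRecord` and `CLEANUP_ADDED_ON` are hypothetical names, and it assumes the deletion code took effect on the date quoted above.

```ruby
require "date"

# Date on which the attachment deletion code was added, as stated above.
CLEANUP_ADDED_ON = Date.new(2016, 3, 23)

# Hypothetical record pairing an edition's discard date with the deleted flag
# of one of its attachments.
DiscardedRecord = Struct.new(:edition_discarded_on, :attachment_deleted, keyword_init: true) do
  # True when the edition was discarded before the deletion code existed.
  def predates_cleanup?
    edition_discarded_on < CLEANUP_ADDED_ON
  end

  # True when the attachment outlived its discarded edition.
  def inconsistent?
    !attachment_deleted
  end
end

record = DiscardedRecord.new(
  edition_discarded_on: Date.new(2014, 5, 9),
  attachment_deleted: false
)

record.predates_cleanup? # => true
record.inconsistent?     # => true
```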

I'm guessing that when the assets were imported from Whitehall to Asset Manager, this discrepancy was copied over accidentally.

Closing this as the investigation is complete and we have a card to complete the follow-up work.