tommyblue / smugmug-backup

Makes a full backup of a SmugMug account

Suggestion: get the title, caption, keywords data for each file

martinrwilson opened this issue · comments

For example: at the end of downloading all the files, create a CSV file containing a row for each file and a column for:

  • the path of the file
  • title
  • caption
  • keywords
(i.e. the metadata that can be entered against each file in SmugMug).
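
To make the suggestion concrete, here is a minimal sketch of such a CSV in Python. The record layout, the sample values and the `write_metadata_csv` helper are all hypothetical and only illustrate the four columns listed above; this is not the tool's actual output format.

```python
import csv
import io

# Hypothetical records: field names and values are illustrative only.
records = [
    {"path": "Travel/Rome/IMG_0001.jpg", "title": "Colosseum",
     "caption": "Early morning, before the crowds", "keywords": ["rome", "italy"]},
    {"path": "Travel/Rome/IMG_0002.jpg", "title": "",
     "caption": 'A "quoted" caption, with a comma', "keywords": []},
]

def write_metadata_csv(records, out):
    """Write one row per file with path, title, caption and keywords columns."""
    writer = csv.DictWriter(out, fieldnames=["path", "title", "caption", "keywords"])
    writer.writeheader()
    for rec in records:
        row = dict(rec)
        # Keywords are joined into a single cell; the separator is an arbitrary choice.
        row["keywords"] = "; ".join(rec["keywords"])
        writer.writerow(row)

buf = io.StringIO()
write_metadata_csv(records, buf)
print(buf.getvalue())
```

The `csv` module takes care of quoting, so captions containing commas or quotation marks survive a round trip.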
commented

This would be really handy, especially for captions. Although they are available via the API, for some unfathomable reason SmugMug does not store them as IPTC metadata, as would make sense, but externally, which makes migrating away from SmugMug really annoying if you care about your captions :(

I could add a configuration value, like export_metadata=<bool>. When true, we could create a file (or multiple files?) with metadata in it.

It's not really my use case, so I'm asking you, @martinrwilson @casef: which do you think is the better option, a single CSV file or a JSON file (or other format) for each file?

commented

I frankly have no idea. I figure a file per photo could be a bit easier to process with other tools, but I don't really have a clue how exactly I would proceed once I at least have the captions in a usable form independent of the SmugMug site itself ;)

commented

A de facto standard for importing into DAM (Digital Asset Management) solutions is a single CSV file containing the metadata for the set of files. A row contains the metadata for a file, with a column for each metadata field (e.g. "Caption", "Keywords") and a column identifying the file. If the filenames are unique in the set of exported files then the identifier could just be the filename. Otherwise, it could be the path of the file in the exported folder structure.
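
The identifier rule described above could be sketched roughly like this in Python. The `file_identifiers` helper is hypothetical and assumes paths relative to the export folder, using forward slashes.

```python
from pathlib import PurePosixPath

def file_identifiers(paths):
    """Return bare filenames if they are all unique, else the relative paths."""
    names = [PurePosixPath(p).name for p in paths]
    if len(set(names)) == len(names):
        return names
    # At least two files share a name, so only the full path identifies a file.
    return list(paths)

print(file_identifiers(["Rome/IMG_0001.jpg", "Paris/IMG_0002.jpg"]))
print(file_identifiers(["Rome/IMG_0001.jpg", "Paris/IMG_0001.jpg"]))
```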

Having said that, a metadata file per exported file would be useful too, if that's easier.
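
For comparison, a metadata file per exported file might look like the following sketch. The `.json` sidecar naming and the field names are assumptions made for illustration, not something the tool does.

```python
import json
import tempfile
from pathlib import Path

def write_sidecar(image_path, metadata):
    """Write metadata next to the image as <image name>.json; return the sidecar path."""
    image_path = Path(image_path)
    sidecar = image_path.with_name(image_path.name + ".json")
    sidecar.write_text(json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8")
    return sidecar

# Demo in a temporary directory, using an empty placeholder instead of a real photo.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "IMG_0001.jpg"
    image.touch()
    sidecar = write_sidecar(
        image, {"title": "Colosseum", "caption": "Early morning", "keywords": ["rome"]}
    )
    loaded = json.loads(sidecar.read_text(encoding="utf-8"))
    print(sidecar.name, loaded)
```

Keeping the full image name in the sidecar name (`IMG_0001.jpg.json`) avoids collisions between, say, `IMG_0001.jpg` and `IMG_0001.png` in the same folder.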

I guess another option would be to use e.g. Exiftool or similar to embed the metadata into each file (as IPTC metadata). The downside is that you'd be changing the file itself, so the export wouldn't provide exactly the same files that were originally uploaded. (That is why many storage solutions don't use this technique; it is also slower to read metadata out of each file for display in the front end than to read it from a database.) If it's an optional config value, though, this could be a viable solution.

commented

Here's an example metadata export file, showing a proposed format:
export.csv

commented

"I guess another option would be to use e.g. Exiftool or similar to embed the metadata into each file (as IPTC metadata)"

That's kinda what my intention was to at least try once I have the captions in a processable format off Smugmug. But I haven't even looked into how one would go about doing that just yet, my approach was kinda "we'll cross that bridge when (if) we get to it" ;)

(Needless to say, I'm pretty annoyed that there seems to be no official way to back up your site fully, especially given the current pricing.)

@martinrwilson @casef just released v1.4.0 with a new store.write_csv = true conf, test it please 🙏🏼

Not sure when I'll get to actually applying the metadata to the downloaded photos, but just want to say it all seems to get backed up correctly. Thank you so much!

commented

So, life interfered way more than I anticipated, so I only got to finally move my stuff from SmugMug recently. As far as this tool is concerned, it went more or less flawlessly. I did adjust the names of the individual columns in the CSV, just to make it easier for me:

SourceFile,Type,ArchivedUri,Description,Keywords,GPSLatitude,GPSLongitude

Then it was really easy to update the downloaded photos with Exiftool doing something like:

exiftool -csv=D:/Backup/Smugmug/metadata.csv D:/Backup/Smugmug/ -r
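
Renaming the header row by hand works fine, but it could also be scripted along these lines. Note that the original column names in `RENAMES` are hypothetical placeholders (the names the tool actually writes are not given here); only the target names come from the header shown above.

```python
import csv
import io

# Hypothetical source names on the left; target names taken from the header above.
RENAMES = {
    "Filename": "SourceFile",
    "Caption": "Description",
    "Latitude": "GPSLatitude",
    "Longitude": "GPSLongitude",
}

def rename_columns(src, dst, renames):
    """Copy a CSV from src to dst, renaming header columns found in renames."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    writer.writerow([renames.get(name, name) for name in header])
    writer.writerows(reader)  # data rows are copied unchanged

# Made-up sample input, just to show the effect on the header.
sample = io.StringIO(
    "Filename,Type,ArchivedUri,Caption,Keywords,Latitude,Longitude\r\n"
    "album/IMG_0001.jpg,Image,https://example.invalid/1,Sunset,beach,0,0\r\n"
)
out = io.StringIO()
rename_columns(sample, out, RENAMES)
print(out.getvalue().splitlines()[0])
```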

Though after this, I went through the photos once again with Exiftool and copied the relevant tags from their XMP versions to IPTC as well (since the software I'm now using does not work well with XMP metadata, but has no trouble working with IPTC).
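
For anyone wanting to do something similar, a rough sketch of such an XMP-to-IPTC copy is below. Which tags were actually copied is not stated above, so the two tag pairs here (description to caption, subject to keywords) are assumptions; the script only builds and prints the Exiftool command rather than running it.

```python
import subprocess

def xmp_to_iptc_command(root):
    """Build an exiftool command that copies two XMP tags to IPTC, recursively."""
    return [
        "exiftool",
        "-tagsfromfile", "@",                          # copy tags within each file itself
        "-IPTC:Caption-Abstract<XMP-dc:Description",   # assumed pair: caption
        "-IPTC:Keywords<XMP-dc:Subject",               # assumed pair: keywords
        "-r",                                          # recurse into subfolders
        root,
    ]

def run_copy(root):
    """Run the command for real; requires exiftool on PATH. Not called in this sketch."""
    subprocess.run(xmp_to_iptc_command(root), check=True)

cmd = xmp_to_iptc_command("D:/Backup/Smugmug/")
print(" ".join(cmd))
```

By default Exiftool keeps a copy of each modified file with an `_original` suffix, so it is easy to check the result on a small folder first.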

So once again thank you so much for making this possible!

Just a heads-up, though - SmugMug seems to have come up with yet another way to complicate a complete site backup for people. Apparently, if you replace any of the photos with updated ones (and possibly even when you just upload new ones; I did not have it in me to test it properly), they strip out most of the metadata from the image, and there's no way to get it back other than using their horribly inefficient album-by-album backup. I've tried several third-party tools, including those they specifically recommend in their knowledge base and even some paid ones, but none of them were able to get the actual original file with the original metadata off SmugMug. The metadata is still available: it shows next to the photo on the site, and their own download gives you the original photos, but third-party tools do not.

I seriously have no idea what they are even thinking. It's a horrible approach not only to offer no means of doing a full backup in 2024, but to complicate things further for third-party tools. Needless to say, between the several price hikes and stuff like this, I'm quite glad this will be my last month with them (after some 10 years or so). It's just a horrible shame IMO.

@casef just out of personal curiosity, which platform did you choose to replace SmugMug? I also tested some alternatives (Google Photos and Amazon Photos), but they don't offer a full backup either. Moreover, in the end, their price is even higher in my case 😢

commented

@tommyblue I looked at quite a few options, but in the end I just went with a self-hosted Piwigo that I customized a bit to suit my needs and expectations (and in some ways, it already does that better than SM ever did). I already had hosting available that I use for other projects; I just had to upgrade the storage a bit to make room for the photos, but I still end up saving most of what I pay for SM nowadays. And since my hosting also does daily backups, there's little to worry about on that front too.