DanAtkinson / Fuskr

Fuskr - an image gallery extension for Google Chrome

Home Page: http://danatkinson.github.io/Fuskr/

Not detecting 404s on certain pages

mattman00000 opened this issue · comments

On some sites I get a 404 in wget but Fuskr still loads an image. Examples include, I think, all of foto.my.mail.ru.

Please can you provide an example of a 404 image url?

@jbolster,

Any ideas? I can see the problem, but I don't think jQLite returns the status code on the 'load' event, so I don't see how we could interrogate the response from the server.

Currently, we only care if the file has been loaded, not whether it's been loaded but with an error.

Example Fusk url:
chrome-extension://balbojkopkiehjjnmpohcobpejmioppl/Html/images.htm#/fusk/http://content.foto.mail.ru/mail/thereisnowhere/_myphoto/s-[1-3].jpg

Thanks, Dan

I did make a stopgap bookmarklet a while back (no idea if it works with the Angular update) that hides images with the same dimensions as a selected image:

javascript:if ((getSelection().rangeCount==1)&&(getSelection().getRangeAt(0).commonAncestorContainer.getElementsByTagName("img").length!=0)){var twidth = getSelection().getRangeAt(0).commonAncestorContainer.getElementsByTagName("img")[0].width;var theight = getSelection().getRangeAt(0).commonAncestorContainer.getElementsByTagName("img")[0].height;var fuskImages = document.getElementsByClassName("fuskImage");for (var i = 0;i<fuskImages.length;i++){if ((fuskImages[i].width==twidth)&&(fuskImages[i].height==theight)){fuskImages[i].parentNode.parentNode.className="hide wrap error" ;}}}else {alert("select a picture first");}

I had a thought that it might not be a bad idea to have some form of "remove images whose dimensions (or perhaps other properties) match certain criteria" functionality anyway. That could be fancy upper and lower bound sliders like those of "Image Downloader" (chrome extension id cnpniohnfphhjihaiiggeabnkjhpaldj), or just width and height comparison operator dropdowns with an HTML5 number input field (an and/or dropdown wouldn't hurt either). I could give it a try, but I'm a bit rusty on chrome extensions and I'm tied up until some time in May.
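To make the dropdown idea above concrete, here is a rough sketch of the matching logic only, with no UI. The names `OPERATORS`, `matchesDimensions`, and the shape of the `rule` object are all hypothetical and are not part of Fuskr.

```javascript
// Hypothetical comparison operators, one per dropdown entry.
var OPERATORS = {
    "<": function (a, b) { return a < b; },
    "<=": function (a, b) { return a <= b; },
    "=": function (a, b) { return a === b; },
    ">=": function (a, b) { return a >= b; },
    ">": function (a, b) { return a > b; }
};

// Returns true when the image matches the rule and should be removed.
// rule.join is "and" or "or" (the and/or dropdown); anything else is treated as "and".
function matchesDimensions(image, rule) {
    var widthMatch = OPERATORS[rule.widthOp](image.width, rule.width);
    var heightMatch = OPERATORS[rule.heightOp](image.height, rule.height);
    return rule.join === "or" ? (widthMatch || heightMatch) : (widthMatch && heightMatch);
}
```

For example, a rule of `{ widthOp: "=", width: 100, heightOp: "=", height: 100, join: "and" }` would do roughly what the bookmarklet does for a selected 100x100 placeholder.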

Hi Matt,

This is a really tricky thing to do properly and I actually considered this a few years ago to handle hotlink warnings. Ultimately though, Fuskr seemed to 'beat' hotlink checks so I didn't bother implementing it.

I'm all for adding the functionality but it seems like this other extension (GitHub project page) (which I'd never seen before today!) does it in a way I probably wouldn't have. I think filtering by dimensions is okay, but there are probably other ways of doing it, like grabbing a hash of an image and removing images with a duplicate hash without the requirement for user intervention. Possibly an overhead here though.
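As a rough illustration of the hashing idea, the sketch below drops every image whose bytes hash to a value shared with another image, on the assumption that repeated identical images are placeholders. The function names and the `{ url, bytes }` object shape are hypothetical, and FNV-1a is used only because it fits in a few lines; a real implementation might prefer a stronger hash.

```javascript
// 32-bit FNV-1a hash over an array (or Uint8Array) of byte values.
function fnv1a(bytes) {
    var hash = 0x811c9dc5;
    for (var i = 0; i < bytes.length; i++) {
        hash ^= bytes[i];
        hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash;
}

// Keeps only images whose content is unique within the gallery.
// Assumption: each image is { url: string, bytes: array of byte values }.
function dropRepeatedImages(images) {
    var counts = {};
    var keys = images.map(function (image) {
        // Include the length in the key to reduce accidental collisions.
        var key = image.bytes.length + ":" + fnv1a(image.bytes);
        counts[key] = (counts[key] || 0) + 1;
        return key;
    });
    return images.filter(function (image, index) {
        return counts[keys[index]] === 1;
    });
}
```

The overhead mentioned above is visible here: every image's bytes have to be available to script and read once before anything can be filtered.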

Also, it allows the user to create a folder when downloading. I didn't think that was possible in the API. I may have to look at that functionality a little further. I like the popup idea, but couldn't implement it in Fuskr due to Google's 'single purpose' rules on extensions which this might break.

Thanks, Dan

image-downloader GitHub project page

This might actually be possible.

Originally I was commenting that it wouldn't be possible due to CORS, but according to the XHR documentation extensions aren't limited in the same way (hurrah) as long as you specify the sites to have permission for. And we happen to already have the permissions for all sites.

This surprised me, but it's great to know!

So I'm thinking of these options:

  1. Request the images via JS first, then display them as we do currently (there will be a second request per image but it will hit the cache).
  2. Do what we do already and before it counts as 'success', do a quick request for the image to get the headers.
  3. Request the image, stick the data in a canvas.

I'm thinking of option 2 here. If we do a request after the image has loaded, then it will go straight to the cache anyway, and just acts as a confirmation.

I looked at option 1 previously which does sound like option 2. Both involve more than one request. If you're just doing a HEAD request, that should be quick and return the relevant status, but this feels like we're just trying to get around the problem by making twice the number of requests.
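For reference, a HEAD request like the one mentioned above might look roughly like this with plain XMLHttpRequest. `checkImageStatus` is a hypothetical helper, not existing Fuskr code, and the optional `XhrCtor` parameter is an assumption added only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: fetch only the headers for a URL and report the status code.
// callback(err, status) follows the usual error-first convention.
function checkImageStatus(url, callback, XhrCtor) {
    var Ctor = XhrCtor || XMLHttpRequest;
    var xhr = new Ctor();
    xhr.open("HEAD", url, true);
    xhr.onload = function () {
        // onload fires for any completed response, including 404s.
        callback(null, xhr.status);
    };
    xhr.onerror = function () {
        callback(new Error("Network error for " + url));
    };
    xhr.send();
}
```

Note that this is still a second request per image, which is exactly the cost being weighed above.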

Canvas is one way to go and would allow us to do hashing and some rudimentary image analysis more easily, but I imagine that the cost is quite high.

Option 1 is JS making the request before the img element does.
Option 2 is allowing the img tag to load before failing/succeeding.

Both do make twice the number of requests, but if the image was previously successful then the second request would just hit the cache anyway. This is why option 2 is now out for me: the browser doesn't look at the cache when the request previously failed, so it would be an actual second network request.

With option 3, we could just stick the response data in an image element which could work: http://stackoverflow.com/a/10687544/261677

The fact that we can actually make these cross-origin requests in JS opens up a new option of potentially zipping up images before saving (though that's another issue altogether).

That's fine though. We don't need to perform requests on failures, just on successes (loaded images) that return a 4xx (client error) or 5xx (server error). So we just check whether the status code is in the 400-599 range.

2xx (success) or 3xx (redirection) should be considered successful along with everything else.
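The rule described above boils down to a one-line check. `isErrorStatus` is a hypothetical name used for illustration, not necessarily what the fix uses.

```javascript
// True for 4xx (client error) and 5xx (server error) responses only.
// Everything else, including 2xx and 3xx, is treated as a successful load.
function isErrorStatus(status) {
    return status >= 400 && status < 600;
}
```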

Re your zipping suggestion, what do you mean? Would you mind breaking it out into a separate issue, or perhaps discussing it on Gitter?

Fixed by #20.

I've looked at how this can be fixed again and have made a small change to detect data types. I've found a couple of cases where calling non-existent images results in a 200 response, but the response is actually a web page which can't be rendered as an image.

We could/should also check for blob data size as well, but for now, this should stop image galleries being loaded where the response is clearly invalid.
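A check along these lines could be sketched as follows. This is not the actual change made in the repository; `looksLikeImage` and its parameters are hypothetical, and it assumes the response's Content-Type header and blob size are already to hand.

```javascript
// Returns true only when the response claims to be an image and is not empty.
// contentType: the Content-Type header value, e.g. "image/jpeg" or "text/html; charset=utf-8".
// byteLength: the size of the response blob in bytes.
function looksLikeImage(contentType, byteLength) {
    if (typeof contentType !== "string") {
        return false;
    }
    // Strip parameters such as "; charset=utf-8" before comparing.
    var mime = contentType.split(";")[0].trim().toLowerCase();
    return mime.indexOf("image/") === 0 && byteLength > 0;
}
```

A 200 response that is really an HTML error page fails the first condition, and an empty body fails the second.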