Add url asking for selecting an importer/downloader not working with batch option

Question

Twix53791 opened this issue a year ago

papis version: 0.13
The issue:

papis add -b https://www.sciencedirect.com/science/article/abs/pii/S1040618215011039

[ERROR] commands.add: No document is created, since no data or files have been found. Try providing a filename, an URL or use `--from [importer] [uri]` to extract metadata for the document.

The command fails because no importer/downloader is selected. The --from [importer] option cannot help here, because sciencedirect is a downloader, not an importer.

Fix

I fixed it by editing commands/add.py:

    if batch:
        edit = False
        confirm = False
        open_file = False

    only_data = bool(files) and not force_download
    matching_importers = papis.utils.get_matching_importer_by_name(
        from_importer, only_data=only_data)

-    if not from_importer and not batch and files:
+    if not from_importer and files:
        matching_importers = sum((
            papis.utils.get_matching_importer_or_downloader(f, only_data=only_data)
            for f in files), [])

-        if matching_importers:
+        if matching_importers and not batch:
            logger.info("These importers where automatically matched. "
                        "Select the ones you want to use.")

            matching_indices = papis.tui.utils.select_range(
                ["{} (files: {}) ".format(imp.name, ", ".join(imp.ctx.files))
                 for imp in matching_importers],
                "Select matching importers (for instance 0, 1, 3-10, a, all...)")

            matching_importers = [matching_importers[i] for i in matching_indices]
+        elif matching_importers:
+            if len(matching_importers) >= 2:
+                matching_importers = [matching_importers[1]]
+            else:
+                matching_importers = [matching_importers[0]]

    imported = papis.utils.collect_importer_data(
        matching_importers, batch=batch, only_data=only_data)
    ctx.data.update(imported.data)
    ctx.files.extend(imported.files)

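When the batch option is set, it skips the interactive prompt and selects the downloader at index 1 by default (here sciencedirect), or the one at index 0 if only a single importer/downloader matched.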
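To make the selection rule easier to check in isolation, here is a minimal sketch of the batch branch from the diff above. The Match class and the pick_for_batch function are hypothetical stand-ins that I made up for illustration; they are not part of papis, and the real objects carry more state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Match:
    """Hypothetical stand-in for a matched papis importer/downloader."""
    name: str
    files: List[str] = field(default_factory=list)


def pick_for_batch(matches: List[Match]) -> List[Match]:
    """Mirror the new elif branch: no prompt, pick a single match.

    Index 1 is used when at least two candidates matched,
    otherwise the only candidate (index 0) is used.
    """
    if not matches:
        return []
    if len(matches) >= 2:
        return [matches[1]]
    return [matches[0]]


if __name__ == "__main__":
    # Assumed order: fallback first, then the site-specific downloader.
    candidates = [Match("fallback"), Match("sciencedirect")]
    print([m.name for m in pick_for_batch(candidates)])
```

Note that this rule only picks sciencedirect if the list order is stable, since it depends on which downloader sits at index 1.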

Alex Fikl · Answer 1 · Fri Jul 21 2023 21:46:41 GMT+0800 (China Standard Time)

Thank you for the report! The fix looks reasonable. Can you make a PR with the change? 😁

Twix · Answer 2 · Wed Jul 26 2023 04:42:53 GMT+0800 (China Standard Time)

yep, done!

Twix · Answer 3 · Sat Aug 12 2023 05:43:57 GMT+0800 (China Standard Time)

Ah, a small problem: after listing the matching_importers in utils.py (def get_matching_importer_or_downloader), I found that the fallback downloader (the default one, if I understood correctly) is not always first. I don't know why, but it sometimes ends up in the middle of the list. As a result, the --batch option sometimes selects the first 'special' downloader and sometimes the fallback downloader, which is erratic behavior. So I added a few lines of code that always put the fallback downloader at the beginning of the list (at position 0 when selecting importers).

                logger.info(
                    "{c.Back.BLACK}{c.Fore.GREEN}%s (%s) fetched data for query '%s'!"
                    "{c.Style.RESET_ALL}",
                    name, importer.name, uri)

-                result.append(importer)
+                if importer.name == "fallback":
+                    result.insert(0, importer)
+                else:
+                    result.append(importer)

    return result
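The effect of this change can be sketched outside papis as follows. The Importer tuple and the order_matches function are hypothetical names used only for illustration; the loop body is the same insert/append logic as in the diff above.

```python
from typing import List, NamedTuple


class Importer(NamedTuple):
    """Hypothetical stand-in: only the name matters for the ordering."""
    name: str


def order_matches(found: List[Importer]) -> List[Importer]:
    """Collect matches, forcing the one named "fallback" to position 0."""
    result: List[Importer] = []
    for importer in found:
        if importer.name == "fallback":
            # Always put the fallback downloader at the front.
            result.insert(0, importer)
        else:
            # Everything else keeps its discovery order.
            result.append(importer)
    return result


if __name__ == "__main__":
    # Assumed discovery order with the fallback in the middle.
    found = [Importer("doi"), Importer("fallback"), Importer("sciencedirect")]
    print([imp.name for imp in order_matches(found)])
```

With the fallback pinned at position 0, the batch rule that picks index 1 always lands on the first non-fallback match.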

Alex Fikl · Answer 4 · Sun Aug 20 2023 15:39:09 GMT+0800 (China Standard Time)

The original batch option seems to be fixed by #630.

I'm not sure why you're getting different insert orders in the downloaders though. They're sorted by priority and the fallback one has a priority of -1, so it should always be first. Feel free to open another issue with an example if you have it!
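As a rough illustration of the ordering described here, the sketch below sorts downloaders by priority. It assumes an ascending sort and that every other downloader has a priority above -1; the Downloader class, the sort_by_priority function, and the priority values other than -1 are hypothetical and not taken from the papis source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Downloader:
    """Hypothetical stand-in with only a name and a priority."""
    name: str
    priority: int


def sort_by_priority(downloaders: List[Downloader]) -> List[Downloader]:
    """Ascending sort (an assumption): the lowest priority comes first."""
    return sorted(downloaders, key=lambda d: d.priority)


if __name__ == "__main__":
    matched = [
        Downloader("sciencedirect", 10),  # assumed value
        Downloader("fallback", -1),       # value stated in the thread
        Downloader("doi", 0),             # assumed value
    ]
    print([d.name for d in sort_by_priority(matched)])
```

Under these assumptions the fallback downloader ends up first no matter where it appears in the input, so a different order would point to the list being built without this sort.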