sckott / habanero

client for Crossref search API

Home Page: https://habanero.readthedocs.io

Feature request: progress indicator when deep paging

gorbynet opened this issue · comments

When systematically retrieving large data sets, it would be useful to have some way of measuring progress through the data harvest, e.g. show how many records have been retrieved, and how many there are in total, while the retrieval is ongoing.

thanks @gorbynet

I assume you mean with deep paging? I'm looking into it; I haven't done progress bars in Python before.

clint is another option

Scott, thanks for picking this up. I'm not sure how it would work (I'm quite inexperienced at Python) but what I meant was that it would be useful to have some way of showing progress when deep paging through a large dataset. That might be a progress bar, or some way of the crossref module feeding back to the calling script so that the script can choose how to reflect that information.
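To make the "feeding back to the calling script" idea concrete, here is a minimal sketch of a callback approach. It is not habanero's API: `harvest`, `on_progress`, and the sample `pages` are hypothetical names used only for illustration.

```python
def harvest(pages, total, on_progress=None):
    """Collect records from an iterable of pages.

    After each page, call on_progress(records_so_far, total) if a
    callback was supplied, so the caller decides how to show progress.
    """
    records = []
    for page in pages:
        records.extend(page)
        if on_progress is not None:
            on_progress(len(records), total)
    return records


# The calling script chooses what to do with the numbers; here it just stores them.
seen = []
pages = [[1, 2], [3, 4], [5]]  # stand-in for pages returned by the server
result = harvest(pages, total=5, on_progress=lambda done, total: seen.append((done, total)))
print(seen)
```

With a hook like this the script could print a counter, update a progress bar, or write to a log, without the library committing to any one display.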

thanks, i'll experiment and ask for your feedback

this is more complicated than I thought. For a progress bar to work, I think we need a rough idea up front of how many requests we'll make.

we're doing requests in a while loop https://github.com/sckott/habanero/blob/master/habanero/request_class.py#L68-L73 - we need to look at total results found by the server and also cursor_max
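A rough sketch of how that estimate could be computed, assuming the loop stops at whichever is smaller, the server's total or `cursor_max`, and fetches `limit` records per request. `estimate_requests` is a hypothetical helper, not a function in habanero.

```python
import math


def estimate_requests(total_results, cursor_max, limit):
    """Estimate how many paged requests a deep-paging loop will make."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Paging ends at the smaller of what the server has and the cap we set.
    records_to_fetch = min(total_results, cursor_max)
    # The last page may be partial, so round up.
    return math.ceil(records_to_fetch / limit)


print(estimate_requests(total_results=12345, cursor_max=5000, limit=500))
print(estimate_requests(total_results=1200, cursor_max=5000, limit=500))
```

This is only an estimate: the total is known only after the first response comes back, and the real loop may stop early if a page comes back empty.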

okay, install from the progress-bar branch:

pip3 install --user https://github.com/sckott/habanero/archive/progress-bar.zip

and try e.g.,

from habanero import Crossref
cr = Crossref()
res = cr.works(query = "octopus", cursor = "*", limit = 500, progress_bar = True)

let me know what you think

the progress bar is not integrated into requests - it's only activated when deep paging - it's just a progress bar on the while loop that uses an estimate of how many requests will be done
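For readers curious what "a progress bar on the while loop" might look like, here is a self-contained sketch using only the standard library. It is not the code in the progress-bar branch: `fake_fetch` is a hypothetical stand-in for one API request, and the counter written to stderr stands in for a real progress bar.

```python
import math
import sys


def fake_fetch(cursor, limit, total):
    """Hypothetical stand-in for one request: returns (records, next_cursor)."""
    start = int(cursor)
    end = min(start + limit, total)
    return list(range(start, end)), str(end)


def deep_page(total, limit, cursor_max, out=sys.stderr):
    """Page through results in a while loop, reporting progress per request."""
    target = min(total, cursor_max)
    expected = math.ceil(target / limit)  # estimated number of requests
    records, cursor, done = [], "0", 0
    while len(records) < target:
        page, cursor = fake_fetch(cursor, limit, total)
        if not page:  # server ran out of results earlier than estimated
            break
        records.extend(page)
        done += 1
        out.write(f"\rrequest {done}/{expected}")
    out.write("\n")
    return records[:cursor_max]


records = deep_page(total=1200, limit=500, cursor_max=5000)
print(len(records))
```

Because the progress is tied to the loop rather than to the HTTP layer, it advances once per request, not continuously during a download.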

any thoughts on this @gorbynet ?

Apologies - this is my personal GitHub account, and I only use the Crossref library for work, and haven't looked at it for months. I've got an audit that needs the deep paging, so I'll have a look at this now.

Hi @sckott , that works perfectly. Thank you!

great, glad it works. I'll merge this into master soon