`str(GitTransportRI)` broken, and with it `_get_flexible_source_candidates()`
mih opened this issue
This was reported in the office hour. The setup is a superdataset/subdataset configuration, where the superdataset was cloned from a `datalad-annex::` URL. That worked fine.
Now getting a subdataset fails, because a generated candidate URL is exactly the same as the superdataset remote URL.
Here is where it happens:
> /home/mih/env/datalad-dev/lib/python3.11/site-packages/datalad/distribution/utils.py(74)_get_flexible_source_candidates()
-> src = str(ri)
(Pdb) p ri
GitTransportRI(RI='file:///tmp/julia/demo_micro_datalad/newstore/QC/B31_4318-datalad?type=external&externaltype=uncurl&encryption=none&url={noquery}/{{annex_key}}', path='inputs/se4318', transport='datalad-annex')
(Pdb) p str(ri)
'datalad-annex::file:///tmp/julia/demo_micro_datalad/newstore/QC/B31_4318-datalad?type=external&externaltype=uncurl&encryption=none&url={noquery}/{{annex_key}}'
`str(ri)` simply ignores the fact that there is a `path='inputs/se4318'`.
While other code layers should catch the resulting fallout, generating such a candidate URL makes no sense to begin with.
I believe what should have been generated is
datalad-annex::file:///tmp/julia/demo_micro_datalad/newstore/QC/B31_4318-datalad/inputs/se4318?type=external&externaltype=uncurl&encryption=none&url={noquery}/{{annex_key}}
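To make the expected result concrete, here is a minimal sketch of the string manipulation I have in mind. It is not DataLad code: the helper name `candidate_with_path` is made up for illustration, and it assumes that the subdataset path belongs at the end of the URL's path component, before the query string. Whether that is the right semantics for every `datalad-annex::` URL is exactly what needs discussion.

```python
def candidate_with_path(transport: str, url: str, path: str) -> str:
    """Hypothetical helper (not part of DataLad): build a candidate URL
    that places `path` after the URL's path component, before the query."""
    # Split off the query string at the first '?'; `sep` is '' if there is none.
    base, sep, query = url.partition("?")
    # Join with exactly one slash between the base URL and the relative path.
    joined = base.rstrip("/") + "/" + path.lstrip("/")
    # Re-attach the query and prefix the Git transport helper name.
    return f"{transport}::{joined}{sep}{query}"


# Values taken from the pdb session above.
url = (
    "file:///tmp/julia/demo_micro_datalad/newstore/QC/B31_4318-datalad"
    "?type=external&externaltype=uncurl&encryption=none"
    "&url={noquery}/{{annex_key}}"
)
candidate = candidate_with_path("datalad-annex", url, "inputs/se4318")
print(candidate)
```

With the values from the session, this yields the URL given above, with `inputs/se4318` placed before the `?` rather than dropped.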