datalad / datalad

Keep code, data, containers under control with git and git-annex

Home Page: http://datalad.org

`clone()` stores wrong relpath `url` in `.gitmodules`

mih opened this issue

Here is the demo. I create two datasets, and then clone one into the other:

C:\Users\mih>datalad create 2bsub
create(ok): C:\Users\mih\2bsub (dataset)

C:\Users\mih>datalad create top
create(ok): C:\Users\mih\top (dataset)

C:\Users\mih>cd top
C:\Users\mih\top>datalad clone -d. ..\2bsub subds1

C:\Users\mih\top>type .gitmodules
[submodule "subds1"]
        path = subds1
        url = ../../2bsub
        datalad-id = 2589d16c-89e6-4632-9c04-cca9a4874e23
        datalad-url = ../2bsub

The datalad-url property is correct and matches the clone source argument. The url property, which would be relevant for plain Git operations, points to a non-existent location.

Nope, this is not Windows-specific. This is a general problem, and it replicates in exactly the same fashion on Debian.

The culprit is this line:

url = subm.get_remote_url(remote) if remote else None

It takes the URL from the just-cloned (sub)dataset, which is fine in general. However, if the URL is actually a relative path, it is relative from the POV of the subdataset, whereas it would need to be relative to the superdataset to be correct.

If it is a relative path, it would always be in POSIX notation, because we put it there like this.
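A minimal sketch of the rebasing that would be needed, assuming the URL is a POSIX-style relative path and that the subdataset path is given relative to the superdataset. The helper name `rebase_relative_url` is hypothetical and not part of DataLad, and treating only `./` and `../` prefixes as relative is an assumption made for illustration:

```python
import posixpath


def rebase_relative_url(url, sub_relpath):
    """Re-express `url`, given relative to the subdataset, relative to the superdataset.

    `sub_relpath` is the POSIX path of the subdataset inside the superdataset.
    Hypothetical helper for illustration only.
    """
    # Assumption: only URLs starting with './' or '../' are relative paths;
    # anything else (absolute paths, scheme://... URLs) is returned unchanged.
    if not url.startswith(("./", "../")):
        return url
    # Prefix the subdataset location, then collapse the '..' components lexically.
    return posixpath.normpath(posixpath.join(sub_relpath, url))


# Values from the demo above: the clone lives in 'subds1' and reports '../../2bsub'.
print(rebase_relative_url("../../2bsub", "subds1"))  # ../2bsub
```

With the values from the demo, this yields the same `../2bsub` that is already recorded as `datalad-url`. Note that `posixpath.normpath` resolves `..` purely lexically, so it could give a wrong result if symlinks are involved.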

git submodule does not allow for any of this. Relative paths are made absolute, and file:// is refused:

❯ git submodule add ../2bsub subds_git1
Cloning into '/tmp/top/subds_git1'...
fatal: transport 'file' not allowed
fatal: clone of '/tmp/2bsub' into submodule path '/tmp/top/subds_git1' failed

It is not the same issue, but I have a feeling that #7478 is related to this one.