mcveanlab / treeseq-inference

Work for the tree sequence inference paper.

Cleanup required

jeromekelleher opened this issue

As we're converging on publication, we should tidy things up so that others will be able to follow what we've done and reproduce it.

  1. Move tools we are using (ArgWeaver etc) into a directory called 'tools', and update paths accordingly.
  2. Delete any submodules we're not using (icytree, ftprime_ms ...)
  3. Delete msprime and tsinfer submodules. In practice it's much more robust to put a required version into the code: we can insist on msprime version == 0.6.2 and tsinfer == 0.1.3. (For now we can install from git to run the analyses.)
  4. Delete any code we're not using
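
As a rough illustration of item 3, a version check like the following could run at the top of the analysis scripts. This is only a sketch: the pins are the ones quoted above, but the helper names (`installed_version`, `find_version_mismatches`, `require_exact_versions`) are hypothetical and not part of this repo. A package installed from git may also report a development version string, in which case an exact match like this would fail.

```python
from importlib import metadata

# Pins quoted in the issue text; everything else here is a hypothetical helper.
REQUIRED_VERSIONS = {"msprime": "0.6.2", "tsinfer": "0.1.3"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_version_mismatches(required, lookup=installed_version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, wanted in required.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


def require_exact_versions(required=REQUIRED_VERSIONS, lookup=installed_version):
    """Raise RuntimeError if any pinned package is missing or at another version."""
    mismatches = find_version_mismatches(required, lookup)
    if mismatches:
        details = ", ".join(
            f"{name}: need {wanted}, found {found}"
            for name, (wanted, found) in sorted(mismatches.items())
        )
        raise RuntimeError("Version check failed (" + details + ")")
```

Calling `require_exact_versions()` before any analysis runs would then fail early with a readable message instead of producing plots from the wrong library versions.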

The goal here is to make the simplest and cleanest repo we can that still allows someone else to reproduce the plots in the paper exactly. If we want to keep code for other purposes we can make a copy of the repo. Once the paper is published this repo will be frozen.

How does this sound @hyanwong?

Yes, I think this sounds the right way forward. Alternatively we could open a new repo and copy stuff as required into there, saving the older one for further development? Happy either way. Also happy to work on this.

Also, I wonder if we should move to using empirical error for the plots instead of our basic method. Happy to implement that too.

> Yes, I think this sounds the right way forward. Alternatively we could open a new repo and copy stuff as required into there, saving the older one for further development? Happy either way. Also happy to work on this.

Let's keep it in this one I think. This way the full history will be available. A good way to keep stuff for development would be to fork this repo to your own account. They can diverge as they like then.

> Also, I wonder if we should move to using empirical error for the plots instead of our basic method. Happy to implement that too.

This is a separate issue. It's worth trying out the empirical error to see what happens. But let's discuss it elsewhere and try to keep this issue on track.

> Let's keep it in this one I think. This way the full history will be available. A good way to keep stuff for development would be to fork this repo to your own account. They can diverge as they like then.

OK, will do that. Agree about the empirical error too, although I thought I might implement that soon-ish (in the next few days)

It's looking great, thanks @hyanwong. I think there are a few more things we can clean up, and then we can close this issue:

  • Does the 'visualisations' directory do anything? I can't see us using this for the paper.
  • Likewise for the 'tests' directory. Having a quick scan through it, most of the stuff in there is out of date and probably doesn't run. I don't think there's much point in having test cases here, so I would delete it.

Good point about 'visualisations' - I had forgotten I made that. I will delete it. Happy to kill tests too - I think that's mostly your code. Do you want to delete or shall I?

Please go ahead and delete it if you don't mind.

I also see that at the top level there is a file called ts_extras.txt which seems to be a more recent near-duplicate of src/ts_extras.py, so I've unified them and am testing now.
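
For anyone repeating this kind of merge, it can help to look at a diff of the two near-duplicates before unifying them. The sketch below uses the standard library's `difflib`; the function names `diff_text` and `diff_files` are hypothetical, and this is not necessarily how the files were actually compared here.

```python
import difflib
from pathlib import Path


def diff_text(old_text, new_text, old_name="old", new_name="new"):
    """Return the unified diff between two strings as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


def diff_files(old_path, new_path):
    """Return the unified diff between two text files as a list of lines."""
    return diff_text(
        Path(old_path).read_text(),
        Path(new_path).read_text(),
        str(old_path),
        str(new_path),
    )
```

With the files mentioned above, `diff_files("src/ts_extras.py", "ts_extras.txt")` would list the lines that exist only in the more recent copy with a leading `+`, and an empty result would mean the two are identical.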