caporaso-lab / mockrobiota

A public resource for microbiome bioinformatics benchmarking using artificially constructed (i.e., mock) communities.

Home Page: http://mockrobiota.caporasolab.us

Permanent "issues" or notes pages for each mock community dataset

nbokulich opened this issue:

We should create a permanent place for users to archive notes, tips, and known issues for each dataset. Some of these "issues" are not errors per se and do not need to be corrected, but should be documented for future users: for example, whether mapping barcodes need to be reverse complemented for demultiplexing, or whether header lines in raw data cause problems with specific software. Other observations are not "issues" at all but are still useful to share, e.g., whether specific samples in a dataset have low read counts post-QC and should be ignored.

@gregcaporaso what do you think?

I see 2 possibilities:

  1. Create an issue for each mock community and leave that issue open permanently. This keeps everything associated with a single MC organized in one place, and more comments can be added to that page as more notes and observations are made. The advantage is that notes are added as comments without the need for a PR, which streamlines the process. The disadvantages are that real issues will be intermixed with notes (even if we separate the "notes" page from real issues, things may get messy, as they already are!) and that the issue page cannot be closed when real issues are solved. Simply having separate issues could also get long and messy.

  2. Create a notes file in the main directory for each dataset. Users will need to submit a PR to add permanent notes, though this could also help keep things tidy. Another disadvantage is that users would need to go looking for this file, whereas the issues page is where most users will already be searching for known issues.

I think this is a good idea, and I strongly prefer the second option. The reason is that everything is then in the repository, so if it were ever to move (e.g., away from GitHub), or is accessed offline, all of the relevant information would be included. GitHub issues also aren't great for this kind of thing because, as they get long, valuable information can get lost in the discussion (whereas if the information is in a file, it's easy to make it more apparent). Also, note that it is possible to edit files and submit pull requests on GitHub in the web browser (via the file edit links), which could simplify getting notes added.

Perhaps a README.md in the base directory for each dataset would be a good way to do this? These files could also contain the human-readable-description or other info and list known issues at the bottom.

That would work.

On Thu, Sep 8, 2016 at 12:55 PM, Nicholas Bokulich <notifications@github.com> wrote:

> Perhaps a README.md in the base directory for each dataset would be a good way to do this? These files could also contain the human-readable-description or other info and list known issues at the bottom.


"Known issues / notes" are now included on the README.md page for each dataset.
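
For anyone wanting to follow the same convention elsewhere, here is a minimal sketch of how a per-dataset README.md with a trailing notes section could be maintained. It is not the script used in this repository: the heading text, the `ensure_notes_section` and `add_note` helpers, and the assumption that the notes section is the last section of the file are all hypothetical.

```python
from pathlib import Path

# Heading text is an assumption; use whatever the existing READMEs use.
NOTES_HEADING = "## Known issues / notes"


def ensure_notes_section(dataset_dir: Path, description: str = "") -> Path:
    """Create README.md in dataset_dir if needed and make sure it has a notes section."""
    readme = dataset_dir / "README.md"
    if readme.exists():
        text = readme.read_text(encoding="utf-8")
    else:
        # Start a new README with the directory name as title (hypothetical layout).
        text = f"# {dataset_dir.name}\n\n{description}\n"
    if NOTES_HEADING not in text:
        # Put the notes section at the bottom, as suggested in the thread.
        text = text.rstrip("\n") + f"\n\n{NOTES_HEADING}\n\n"
    readme.write_text(text, encoding="utf-8")
    return readme


def add_note(dataset_dir: Path, note: str) -> None:
    """Append one bullet to the notes section (assumed to be the last section)."""
    readme = ensure_notes_section(dataset_dir)
    with readme.open("a", encoding="utf-8") as handle:
        handle.write(f"- {note}\n")
```

A call such as `add_note(Path("mock-1"), "example note text")` (directory name and note text are illustrative only) would create the file if it is missing and append the note as a bullet. Keeping the notes in a file like this is what allows them to travel with the repository, which was the main argument for the second option.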