pepkit / geofetch

Builds a PEP from SRA or GEO accessions

Home Page: https://pep.databio.org/geofetch/

Sanitize sample names

nleroy917 opened this issue

Sample names coming from GEO can be messy. This breaks some functionality within other PEP tools. Sample names should not have any odd characters like slashes (/) or spaces.

It would be nice to have geofetch sanitize (or have an option to sanitize) sample names.

Example:
`ATAC-seq from Luminal1/BMP k.o CAF-FOXA1 o.e tumor` becomes `ATAC-seq_from_Luminal1-BMP_k_o_CAF-FOXA1_o_e_tumor`
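For illustration, here is a minimal sketch of the kind of mapping I have in mind. `sanitize_sample_name` is a hypothetical helper, not an existing geofetch function, and the rules are my assumption inferred from the example: slashes become hyphens, and spaces, dots, and any other character outside letters, digits, `_`, and `-` become underscores.

```python
import re


def sanitize_sample_name(name: str) -> str:
    """Hypothetical helper (not part of geofetch): make a GEO sample name safe.

    Assumed rules, inferred from the example in this issue:
      * "/" becomes "-"
      * any run of other disallowed characters (spaces, dots, ...) becomes "_"
    """
    # Trim surrounding whitespace, then turn slashes into hyphens.
    name = name.strip().replace("/", "-")
    # Replace each run of characters outside [A-Za-z0-9_-] with one underscore.
    return re.sub(r"[^A-Za-z0-9_-]+", "_", name)


print(sanitize_sample_name("ATAC-seq from Luminal1/BMP k.o CAF-FOXA1 o.e tumor"))
```

Collapsing runs of disallowed characters into a single underscore is a design choice; two different raw names can then map to the same sanitized name, so a real implementation may also need to check for duplicates.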

Maybe the flag name could be --sanitize-sample-names

See pepkit/pephub#40 for further discussion

I think this should be the default behavior, without any flag.

I have added sample name sanitization to geofetch. Could you please try it and check that everything works correctly?

Looks good on my end