YunxiaoRen / ML-iAMR

Question about data

anuradhawick opened this issue

I want to understand the data hosted in this repo.

  1. The Giessen_dataset folder has two files.
    • Does the cip_ctx_ctz_gen_pheno.csv file contain the resistance labels, with 1 = resistant and 0 = susceptible?
    • Does the cip_ctx_ctz_gen_multi_data.csv file contain the transformed SNPs, with A, C, G, T, N represented as 1, 2, 3, 4, 0?
  2. Is there a way to reproduce the CGR representation? Or have I missed something?
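
To make the encoding question concrete, here is a minimal Python sketch of the mapping as I have guessed it above. The mapping itself is an assumption taken from this question and has not been confirmed by the repository authors; the names BASE_TO_INT, encode_bases and decode_bases are hypothetical.

```python
# Assumed mapping from the question above (A,C,G,T,N -> 1,2,3,4,0).
# This is NOT confirmed to be the encoding used in cip_ctx_ctz_gen_multi_data.csv.
BASE_TO_INT = {"A": 1, "C": 2, "G": 3, "T": 4, "N": 0}
INT_TO_BASE = {code: base for base, code in BASE_TO_INT.items()}


def encode_bases(seq):
    """Map each base to its integer code; unrecognised symbols become 0, like N."""
    return [BASE_TO_INT.get(base.upper(), 0) for base in seq]


def decode_bases(codes):
    """Inverse of encode_bases for codes 0..4; returns a string of bases."""
    return "".join(INT_TO_BASE[code] for code in codes)


if __name__ == "__main__":
    print(encode_bases("ACGTN"))
    print(decode_bases([1, 2, 3, 4, 0]))
```

If the file really uses this coding, decoding a row should give back only the symbols A, C, G, T and N.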

Really appreciate your support on this.

Thanks
A

Could you also comment on INDELs? There is some processing in 01_SNPs_calling.sh, but it is not clear from the paper how they were handled.

Also, how did you treat multi-allelic variants? I see you are using the -m flag in bcftools.

commented

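I would like to understand the data hosted in this repository.

  1. The Giessen_dataset folder has two files.
    • Does the cip_ctx_ctz_gen_pheno.csv file contain the resistance labels, with 1 = resistant and 0 = susceptible?
    • Does the cip_ctx_ctz_gen_multi_data.csv file contain the transformed SNPs, with A, C, G, T, N represented as 1, 2, 3, 4, 0?
  2. Is there a way to reproduce the CGR representation? Or have I missed something?

Thank you very much for your support on this.

Thanks, A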

Hello, could you let me know if you've successfully replicated FCGR? If so, could you please guide me through it? Thank you very much.
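
For anyone looking for a starting point, below is a minimal sketch of a generic frequency chaos game representation (FCGR) in plain Python. It is not the pipeline used in this repository or the paper, which nobody in this thread has confirmed. The corner assignment (A bottom-left, C top-left, G top-right, T bottom-right) is one common convention and is an assumption here, and the names CORNERS, kmer_cell and fcgr are hypothetical.

```python
# Generic FCGR sketch: count every k-mer and place the count in a 2^k x 2^k grid.
# Corner convention (x, y) is an assumption; other papers use different layouts.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def kmer_cell(kmer):
    """Return the (x, y) grid cell of a k-mer.

    In the chaos game each new base halves the distance to its corner, so the
    last base of the k-mer contributes the most significant bit of each index.
    """
    x = y = 0
    for j, base in enumerate(kmer):
        cx, cy = CORNERS[base]
        x += cx << j
        y += cy << j
    return x, y


def fcgr(seq, k):
    """Return a 2^k x 2^k nested list of k-mer counts, indexed as grid[y][x]."""
    size = 1 << k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing N or any other ambiguous symbol.
        if any(base not in CORNERS for base in kmer):
            continue
        x, y = kmer_cell(kmer)
        grid[y][x] += 1
    return grid


if __name__ == "__main__":
    for row in fcgr("ACGTACGTNNACGT", 2):
        print(row)
```

The counts can be normalised, for example divided by the total number of valid k-mers, before being used as an image-like input. Whether that matches what the authors did would need their confirmation.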