cumc / xqtl-protocol

Molecular QTL analysis protocol developed by ADSP Functional Genomics Consortium

Home Page: https://cumc.github.io/xqtl-protocol/

TensorQTL - cis results error

m-mews opened this issue

I successfully ran the code below to generate the cis-eQTL results. However, when I uploaded the results to a MySQL database, I received a warning that the value for the 'se' column was truncated in one row. In that row, the beta for the variant is 0, the p-value is 1, and no se is provided. Thoughts?

Run TensorQTL

sos run xqtl-pipeline/code/association_scan/TensorQTL/TensorQTL.ipynb cis \
    --genotype-file aa_233/geno/by_chr/aa_chr1_22.no_rel.filtered.plink_files_list.txt \
    --phenotype-file aa_233/mol_phe/tmm_per_chr/aa_233_gene_tpm_prep_v2.low_expression_filtered.outlier_removed.tmm.expression.bed.per_chrom.recipe \
    --covariate-file aa_233/cov/aa_233_gene_tpm_prep_v2.low_expression_filtered.outlier_removed.tmm.expression.aa_no_rel.covariate.pca.resid.PEER.gz \
    --cwd aa_233/tensorqtl/cis/rerun \
    --window 1000000 \
    --phenotype_group False \
    --name aa_no_rel \
    --MAC 5 \
    --container xqtl-pipeline/container/singularity/TensorQTL.sif

INFO: Running cis_1:
INFO: cis_1 (index=0) is completed.
INFO: cis_1 (index=5) is completed.
INFO: cis_1 (index=4) is completed.
INFO: cis_1 (index=14) is completed.
INFO: cis_1 (index=7) is completed.
INFO: cis_1 (index=12) is completed.
INFO: cis_1 (index=13) is completed.
INFO: cis_1 (index=15) is completed.
INFO: cis_1 (index=11) is completed.
INFO: cis_1 (index=3) is completed.
INFO: cis_1 (index=9) is completed.
INFO: cis_1 (index=2) is completed.
INFO: cis_1 (index=1) is completed.
INFO: cis_1 (index=16) is completed.
INFO: cis_1 (index=8) is completed.
INFO: cis_1 (index=18) is completed.
INFO: cis_1 (index=10) is completed.
INFO: cis_1 (index=19) is completed.
INFO: cis_1 (index=17) is completed.
INFO: cis_1 (index=6) is completed.
INFO: cis_1 (index=21) is completed.
INFO: cis_1 (index=20) is completed.
INFO: cis_1 output: /mnt/pan/Data14/metabrain_lasso/rna_seq_norm/aa_233/tensorqtl/cis/rerun/aa_no_rel.18.cis_qtl_pairs.18.parquet /mnt/pan/Data14/metabrain_lasso/rna_seq_norm/aa_233/tensorqtl/cis/rerun/aa_no_rel.18.emprical.cis_sumstats.txt... (66 items in 22 groups)
INFO: Running cis_2:
INFO: cis_2 is completed.
INFO: cis_2 output: /mnt/pan/Data14/metabrain_lasso/rna_seq_norm/aa_233/tensorqtl/cis/rerun/TensorQTL.cis._recipe.tsv /mnt/pan/Data14/metabrain_lasso/rna_seq_norm/aa_233/tensorqtl/cis/rerun/TensorQTL.cis._column_info.txt... (4 items)
INFO: Workflow cis (ID=wd9efd416f52eae69) is executed successfully with 2 completed steps and 23 completed substeps.

MariaDB [metabrain_lasso]> load data local infile 'aa_no_rel.1.norminal.cis_long_table.txt' into table aa_miami_b38_cis_tensorqtl IGNORE 1 lines;
Query OK, 15980554 rows affected, 1 warning (1 min 2.94 sec)
Records: 15980554 Deleted: 0 Skipped: 0 Warnings: 1

MariaDB [metabrain_lasso]> show warnings;
+---------+------+------------------------------------------------+
| Level   | Code | Message                                        |
+---------+------+------------------------------------------------+
| Warning | 1265 | Data truncated for column 'se' at row 15712499 |
+---------+------+------------------------------------------------+
1 row in set (0.00 sec)
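
The warning reports a data-row number rather than a file line number. Because the load used IGNORE 1 lines, row 15712499 corresponds to line 15712500 of the file. To check whether other rows have the same problem, something like the following Python sketch could scan the whole table for empty fields. It assumes the long table is tab-delimited with a single header line, which the thread does not confirm. The function name and the sample column names are hypothetical.

```python
import csv
import io


def find_rows_with_empty_fields(handle, delimiter="\t"):
    """Yield (file_line, data_row, column_name) for every empty field.

    file_line counts the header as line 1; data_row is the number that a
    load with 'IGNORE 1 lines' would report (file_line - 1).
    """
    # QUOTE_NONE: treat quote characters as ordinary text, as in a plain TSV.
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = next(reader)
    for file_line, row in enumerate(reader, start=2):
        for index, value in enumerate(row):
            if value.strip() == "":
                # Fall back to a positional name if the row is longer than the header.
                name = header[index] if index < len(header) else f"column_{index}"
                yield file_line, file_line - 1, name


# Tiny in-memory sample with hypothetical column names (not the real header).
sample = (
    "gene\tvariant\tbeta\tse\n"
    "G1\tv1\t-0.09\t0.08\n"
    "G1\tv2\t0.0\t\n"
)
hits = list(find_rows_with_empty_fields(io.StringIO(sample)))
print(hits)
```

For the real file, the same function can be called on `open("aa_no_rel.1.norminal.cis_long_table.txt", newline="")` instead of the in-memory sample.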

(base) [mxm1368@hpc3 rerun]$ sed -n '15712498,15712501p' aa_no_rel.1.norminal.cis_long_table.txt
ENSG00000162851 chr1:246099581_T_C -466680 66 72 0.2481700405620251 -0.097505376 0.084173806 ENSG00000162851 0.16289593279361725 221 C T 246099581 chr1
ENSG00000162851 chr1:246099686_G_A -466575 41 43 0.5147902421233106 0.07195887 0.11026046 ENSG00000162851 0.09728506952524185 221 A G 246099686 chr1
ENSG00000162851 chr1:246099794_C_T -466467 47 50 1.0 0.0 ENSG00000162851 0.11312217265367508 221 T C 246099794 chr1
ENSG00000162851 chr1:246099927_C_T -466334 56 57 0.9702247314695975 -0.0037481217 0.10028176 ENSG00000162851 0.1289592981338501 221 T C 246099927 chr1
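
The third row above has no value between the beta (0.0) and the repeated gene ID, which is where the se would be. If the goal is only to load such rows without the truncation warning, one option is to rewrite empty fields as `\N`, which MySQL and MariaDB read as NULL in LOAD DATA with the default escape character. This Python sketch again assumes a tab-delimited file and a nullable 'se' column. The function name is hypothetical, and it does not address why the se is missing in the first place.

```python
def empty_fields_to_null(line, delimiter="\t"):
    """Return the line with every empty field replaced by the \\N NULL marker."""
    fields = line.rstrip("\n").split(delimiter)
    fixed = [field if field != "" else r"\N" for field in fields]
    return delimiter.join(fixed) + "\n"


def convert_file(source_path, target_path):
    """Stream source_path to target_path, marking empty fields as NULL."""
    with open(source_path) as source, open(target_path, "w") as target:
        for line in source:
            target.write(empty_fields_to_null(line))


# In-memory example: a row whose third field is empty.
example = "G1\tv2\t\tG1\n"
print(repr(empty_fields_to_null(example)))
```

Rows without empty fields pass through unchanged, so the converted file can be loaded with the same LOAD DATA statement.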

commented

@m-mews can you reliably reproduce this problem, e.g., after rerunning the exact same command?

@hsun3163 I wonder whether you have ever seen such "incomplete" result matrices in your downstream processing pipelines. ...