shz9 / viprs

Variational Inference of Polygenic Risk Scores

Home Page: https://shz9.github.io/viprs/

Invalid LD Matrix: Element 0 does not have matching LD boundaries!

xinyu-c9 opened this issue

Sorry to disturb!
I was trying to construct a shrinkage LD matrix using my genotype file. Here is my script:

import magenpy as mgp
gdl = mgp.GWADataLoader("./LD_reference/plink/plink",
                backend='plink')
gdl.compute_ld(estimator='shrinkage',
                genetic_map_ne=11400,
                genetic_map_sample_size=183,
                output_dir='./LD_reference/EUR/VIPRS')

I encountered the following error:

> Reading BED file...
Computing LD matrices:   0%|                                      | 0/22 [00:00<?, ?it/s]/software/conda/envs/wdl/lib/python3.7/site-packages/scipy/sparse/_index.py:125: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_arrayXarray(i, j, x)
Traceback (most recent call last):
  File "VIPRS_EURLD.py", line 7, in <module>
    output_dir='./LD_reference/EUR/VIPRS')
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/GWADataLoader.py", line 563, in compute_ld
    disable=not self.verbose or len(self.genotype) < 2)
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/GWADataLoader.py", line 560, in <dictcomp>
    for c, g in tqdm(sorted(self.genotype.items(), key=lambda x: x[0]),
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/GenotypeMatrix.py", line 260, in compute_ld
    return ld_est.compute(output_dir, temp_dir=tmp_ld_dir.name)
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/stats/ld/estimator.py", line 227, in compute
    temp_dir)
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/stats/ld/estimator.py", line 90, in compute
    if _validate_ld_matrix(ld_mat):
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/stats/ld/utils.py", line 45, in _validate_ld_matrix
    raise ValueError(f"Invalid LD Matrix: Element {i} does not have matching LD boundaries!")
ValueError: Invalid LD Matrix: Element 0 does not have matching LD boundaries!
Computing LD matrices:   0%|                                      | 0/22 [01:07<?, ?it/s]

Do you know what caused this error, and how to fix it? If you need additional information, please feel free to ask me.
Thank you so much in advance!

commented

Hi Xinyu,

Thanks for reporting this bug. I've seen this issue reported before, but I was not able to reproduce the error on my end. Do you mind sharing more information about your system setup? In particular, it would be great to know the Python version, magenpy version, and plink version. It would also help if you could show how you set the plink path (if at all) before running the script.

Thank you,

Shadi
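
The version details requested above can be gathered with a short script. This is only a minimal sketch: the helper names (package_version, plink_version, describe_environment) are hypothetical and not part of magenpy, and it assumes the plink executable is reachable on PATH under the name "plink" and supports the --version flag, as plink 1.9 builds do.

```python
import platform
import shutil
import subprocess


def package_version(name):
    """Return the installed version of a package, or None if it cannot be determined."""
    try:
        from importlib import metadata  # standard library from Python 3.8 onward
    except ImportError:
        return None  # on Python 3.7, use `pip show <name>` instead
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def plink_version(plink_cmd="plink"):
    """Return the banner printed by `plink --version`, or None if plink is not on PATH."""
    path = shutil.which(plink_cmd)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


def describe_environment(plink_cmd="plink"):
    """Collect the three versions asked for in the comment above."""
    return {
        "python": platform.python_version(),
        "magenpy": package_version("magenpy"),
        "plink": plink_version(plink_cmd),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

A value of None means the script could not determine that version, not necessarily that the tool is missing.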

Hi Shadi,

Thanks for your reply!
The Python version I used is 3.7.12, the magenpy version is 0.0.12, and the plink version is v1.90b6.24.
For the full plink path, I'll send an email to you. Do you still use shadi.zabad@mail.utoronto.ca?

Thanks again for your help.

Best,
Xingyu

commented

Hi Xingyu,

Thank you for following up on this. I think I figured out the source of the bug.

While developing and testing the LD computation functionality, I used plink v1.90b4.6 64-bit (15 Aug 2017), and I believe that with this version you won't see the bug. I think that in later versions of plink, such as the one you're using, the developers changed the default value of one of the LD-related flags, so the output of the software is now different, which in turn breaks my code.

Specifically, the flag in question is --ld-window-r2, which sets the threshold used to decide which LD values to include in the output file. In future versions of magenpy, I will try to set this flag explicitly to zero (i.e. --ld-window-r2 0). For now, if you want a quick solution, I recommend using the plink version mentioned above (or any plink 1.9 build from before 2019, I believe).

Hope this solves the problem. If it doesn't, please let me know.
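
To illustrate what setting the flag explicitly could look like, here is a minimal sketch that builds a plink 1.9 command line with --ld-window-r2 0. It is not the command that magenpy runs internally (that command does not appear in this thread). The helper name build_plink_ld_command and the output prefix are hypothetical, and the other window flags (such as --ld-window and --ld-window-kb) are left at their defaults.

```python
import shlex


def build_plink_ld_command(bfile_prefix, out_prefix, plink_cmd="plink", r2_threshold=0.0):
    """Build (but do not run) a plink 1.9 pairwise-LD command with an explicit r^2 threshold."""
    return [
        plink_cmd,
        "--bfile", str(bfile_prefix),            # prefix of the .bed/.bim/.fam files
        "--r2",                                  # request a table of pairwise r^2 values
        "--ld-window-r2", f"{r2_threshold:g}",   # explicit threshold; 0 keeps every pair in the window
        "--out", str(out_prefix),
    ]


cmd = build_plink_ld_command(
    "./LD_reference/plink/plink",       # genotype prefix from the script in this issue
    "./LD_reference/EUR/plink_ld",      # hypothetical output prefix
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the threshold explicitly means the result does not depend on whatever default a given plink build uses.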

Hi Shadi,

I'm sorry for not getting back to you sooner. I've tried plink v1.90b6.7 64-bit (2 Dec 2018), but the error persists. This is the earliest version of plink v1.90 I can find. Could you please send me your plink build so that I can retry? This is my email address: chenxy@big.ac.cn

Thank you so much for your help!

Hi Xinyu, I pushed a large update to both magenpy and viprs that should fix the bugs you reported.
If the issue persists, feel free to reopen it.