icbi-lab / infercnvpy

Infer copy number variation (CNV) from scRNA-seq data. Plays nicely with Scanpy.

Home Page: https://infercnvpy.readthedocs.io/en/latest/


TypeError: read_csv() got an unexpected keyword argument 'sep' in infercnvpy.io.genomic_position_from_gtf

huai-su opened this issue · comments

Report

When I run the first step of infercnvpy,
infercnvpy.io.genomic_position_from_gtf('/mnt/f/sarcoma/Homo_sapiens.GRCh38.109.gtf.gz',adata=adata, gtf_gene_id='gene_name',inplace=True)
a TypeError occurs.
I created a new environment named infercnv and installed only with 'pip install infercnvpy' to rule out other causes.
Thanks.


TypeError                                 Traceback (most recent call last)
Cell In[46], line 1
----> 1 infercnvpy.io.genomic_position_from_gtf('/mnt/f/sarcoma/Homo_sapiens.GRCh38.109.gtf.gz',adata=adata, gtf_gene_id='gene_name',inplace=True)

File ~/miniconda3/envs/infercnv/lib/python3.9/site-packages/infercnvpy/io/_genepos.py:41, in genomic_position_from_gtf(gtf_file, adata, gtf_gene_id, adata_gene_id, inplace)
     11 def genomic_position_from_gtf(
     12     gtf_file: Union[Path, str],
     13     adata: Union[AnnData, None] = None,
   (...)
     17     inplace: bool = True,
     18 ) -> Union[pd.DataFrame, None]:
     19     """Get genomic gene positions from a GTF file.
     20 
     21     The GTF file needs to match the genome annotation used for your single cell dataset.
   (...)
     39         If True, add the annotations directly to adata, otherwise return a dataframe.
     40     """
---> 41     gtf = gtfparse.read_gtf(gtf_file, usecols=["seqname", "feature", "start", "end", "gene_id", "gene_name"])
     42     gtf = (
     43         gtf.loc[
     44             gtf["feature"] == "gene",
   (...)
     48         .rename(columns={"seqname": "chromosome"})
     49     )
...
    119         filepath_or_buffer, 
    120         with_column_names=lambda cols: REQUIRED_COLUMNS,
    121         **kwargs).lazy()

TypeError: read_csv() got an unexpected keyword argument 'sep'

Version information


-----
anndata             0.9.1
infercnvpy          0.4.0
matplotlib          3.7.1
numpy               1.23.5
pandas              2.0.0
scanpy              1.9.3
scipy               1.10.1
seaborn             0.12.2
session_info        1.0.0
-----
PIL                         9.5.0
asttokens                   NA
backcall                    0.2.0
cycler                      0.10.0
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.5.1
decorator                   5.1.1
entrypoints                 0.4
executing                   1.2.0
gtfparse                    NA
h5py                        3.8.0
igraph                      0.10.4
importlib_resources         NA
ipykernel                   6.15.0
jedi                        0.18.2
joblib                      1.2.0
kiwisolver                  1.4.4
leidenalg                   0.9.1
llvmlite                    0.39.1
matplotlib_inline           0.1.6
mpl_toolkits                NA
natsort                     8.3.1
numba                       0.56.4
packaging                   23.1
parso                       0.8.3
pexpect                     4.8.0
pickleshare                 0.7.5
pkg_resources               NA
platformdirs                3.2.0
polars                      0.17.7
prompt_toolkit              3.0.38
psutil                      5.9.0
ptyprocess                  0.7.0
pure_eval                   0.2.2
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.6.0
pydevd_concurrency_analyser NA
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.15.1
pynndescent                 0.5.10
pyparsing                   3.0.9
pyreadr                     0.4.7
pytz                        2023.3
setuptools                  66.0.0
six                         1.16.0
sklearn                     1.2.2
stack_data                  0.6.2
statsmodels                 0.13.5
texttable                   1.6.7
threadpoolctl               3.1.0
tornado                     6.1
tqdm                        4.65.0
traitlets                   5.9.0
typing_extensions           NA
umap                        0.5.3
wcwidth                     0.2.6
zipp                        NA
zmq                         19.0.2
zoneinfo                    NA
-----
IPython             8.12.0
jupyter_client      7.0.6
jupyter_core        5.3.0
-----
Python 3.9.16 (main, Mar  8 2023, 14:00:05) [GCC 11.2.0]
Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
-----
Session information updated at 2023-04-24 12:23

Thanks for reporting. This seems to be an upstream issue with gtfparse:
openvax/gtfparse#34

As a workaround, you could try to install the gtfparse patch from the linked PR:

pip install git+https://github.com/DriesSchaumont/gtfparse.git@sep_rename

Thank you so much for replying.
But the problem persists.
After running 'pip install git+https://github.com/DriesSchaumont/gtfparse.git@sep_rename'
I get this output:

Collecting git+https://github.com/DriesSchaumont/gtfparse.git@sep_rename
Cloning https://github.com/DriesSchaumont/gtfparse.git (to revision sep_rename) to /tmp/pip-req-build-0u6gkdry
Running command git clone --filter=blob:none --quiet https://github.com/DriesSchaumont/gtfparse.git /tmp/pip-req-build-0u6gkdry
Running command git checkout -b sep_rename --track origin/sep_rename
Switched to a new branch 'sep_rename'
Branch 'sep_rename' set up to track remote branch 'sep_rename' from 'origin'.
Resolved https://github.com/DriesSchaumont/gtfparse.git to commit 9bbfaf11dafb8f70d903f6b7596df3701b59de8d
Preparing metadata (setup.py) ... done
Requirement already satisfied: polars in /home/hxylinux/miniconda3/envs/infercnv/lib/python3.9/site-packages (from gtfparse==2.0.1) (0.17.7)
Requirement already satisfied: typing_extensions>=4.0.1 in /home/hxylinux/miniconda3/envs/infercnv/lib/python3.9/site-packages (from polars->gtfparse==2.0.1) (4.5.0)

Then I checked the version information again; the gtfparse entry is updated.
But the bug is still there, unchanged.


anndata 0.9.1
gtfparse NA
infercnvpy 0.4.0
matplotlib 3.7.1
numpy 1.23.5
pandas 2.0.0
scanpy 1.9.3
scipy 1.10.1
seaborn 0.12.2
session_info 1.0.0

PIL 9.5.0
asttokens NA
backcall 0.2.0
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
debugpy 1.5.1
decorator 5.1.1
entrypoints 0.4
executing 1.2.0
h5py 3.8.0
igraph 0.10.4
importlib_resources NA
ipykernel 6.15.0
jedi 0.18.2
joblib 1.2.0
kiwisolver 1.4.4
leidenalg 0.9.1
llvmlite 0.39.1
matplotlib_inline 0.1.6
mpl_toolkits NA
natsort 8.3.1
numba 0.56.4
packaging 23.1
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pkg_resources NA
platformdirs 3.2.0
polars 0.17.7
prompt_toolkit 3.0.38
psutil 5.9.0
ptyprocess 0.7.0
pure_eval 0.2.2
pydev_ipython NA
pydevconsole NA
pydevd 2.6.0
pydevd_concurrency_analyser NA
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.15.1
pynndescent 0.5.10
pyparsing 3.0.9
pyreadr 0.4.7
pytz 2023.3
setuptools 66.0.0
six 1.16.0
sklearn 1.2.2
stack_data 0.6.2
statsmodels 0.13.5
texttable 1.6.7
threadpoolctl 3.1.0
tornado 6.1
tqdm 4.65.0
traitlets 5.9.0
typing_extensions NA
umap 0.5.3
wcwidth 0.2.6
zipp NA
zmq 19.0.2
zoneinfo NA

IPython 8.12.0
jupyter_client 7.0.6
jupyter_core 5.3.0

Python 3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35

Session information updated at 2023-04-25 11:05

The error is:
TypeError Traceback (most recent call last)
Cell In[12], line 1
----> 1 infercnvpy.io.genomic_position_from_gtf('/mnt/f/sarcoma/Homo_sapiens.GRCh38.109.gtf.gz',adata=adata, gtf_gene_id='gene_name',inplace=True)

File ~/miniconda3/envs/infercnv/lib/python3.9/site-packages/infercnvpy/io/_genepos.py:41, in genomic_position_from_gtf(gtf_file, adata, gtf_gene_id, adata_gene_id, inplace)
11 def genomic_position_from_gtf(
12 gtf_file: Union[Path, str],
13 adata: Union[AnnData, None] = None,
(...)
17 inplace: bool = True,
18 ) -> Union[pd.DataFrame, None]:
19 """Get genomic gene positions from a GTF file.
20
21 The GTF file needs to match the genome annotation used for your single cell dataset.
(...)
39 If True, add the annotations directly to adata, otherwise return a dataframe.
40 """
---> 41 gtf = gtfparse.read_gtf(gtf_file, usecols=["seqname", "feature", "start", "end", "gene_id", "gene_name"])
42 gtf = (
43 gtf.loc[
44 gtf["feature"] == "gene",
(...)
48 .rename(columns={"seqname": "chromosome"})
49 )
...
119 filepath_or_buffer,
120 with_column_names=lambda cols: REQUIRED_COLUMNS,
121 **kwargs).lazy()

TypeError: read_csv() got an unexpected keyword argument 'sep'

I am having the same issue, did you resolve it?

The root cause is that the installed Polars version is too new. Polars 0.16.14 and later renamed the sep parameter of read_csv to separator, so gtfparse code that still passes sep fails with this TypeError.

To fix it, uninstall the current Polars and install an earlier version such as 0.16.13, which still uses the parameter name sep:
pip uninstall polars
pip install polars==0.16.13

Finally, restart your environment.
openvax/gtfparse#34
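
To check whether an environment is affected before downgrading, the installed Polars version can be compared against the cutoff named in the comment above. This is only a rough sketch: the 0.16.14 threshold is taken from that comment rather than verified independently, and the helper names are made up for illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Cutoff quoted in the comment above (assumption, not independently verified).
RENAME_VERSION = (0, 16, 14)


def parse_version(text):
    """Turn a version string such as '0.17.7' into a tuple of up to three ints."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def uses_separator_keyword(polars_version):
    """Return True if this Polars version is at or past the assumed rename cutoff."""
    return parse_version(polars_version) >= RENAME_VERSION


def installed_polars_version():
    """Return the installed Polars version string, or None if Polars is missing."""
    try:
        return version("polars")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_polars_version()
    if found is None:
        print("polars is not installed in this environment")
    elif uses_separator_keyword(found):
        print(f"polars {found}: expects 'separator', so passing 'sep' will fail")
    else:
        print(f"polars {found}: still accepts 'sep'")
```

Run it with the same interpreter as the notebook kernel; otherwise it may report a different environment than the one raising the error.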

pip install polars==0.16.13
It worked!

Another way to solve this is to edit read_gtf.py directly: at line 113, delete the "sep" parameter. This works for me.
read_gtf.zip
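
Instead of deleting the keyword by hand, a patched caller could in principle pick whichever name the installed function accepts. The sketch below shows that idea with the standard inspect module. The two reader functions are hypothetical stand-ins for an older and a newer read_csv, not the actual Polars or gtfparse code.

```python
import inspect


def separator_kwarg(func, value="\t"):
    """Build the keyword argument for the delimiter under the name func accepts."""
    params = inspect.signature(func).parameters
    for name in ("separator", "sep"):
        if name in params:
            return {name: value}
    raise TypeError(f"{func.__name__} accepts neither 'separator' nor 'sep'")


# Hypothetical stand-ins for the two API generations (illustration only).
def read_csv_old(path, sep=","):
    return ("old", path, sep)


def read_csv_new(path, separator=","):
    return ("new", path, separator)


if __name__ == "__main__":
    for reader in (read_csv_old, read_csv_new):
        # The same call site works with either signature.
        print(reader("genes.gtf", **separator_kwarg(reader)))
```

This only works when the target function names the parameter explicitly in its signature; a function that accepts it through **kwargs would need a version check instead.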

I pinned gtfparse<2 in the latest release which doesn't use polars. This should fix the issue.