haessar / peaks2utr

A robust Python tool for the annotation of 3’ UTRs

Home Page: https://doi.org/10.1093/bioinformatics/btad112

sqlite3.IntegrityError

adoptbai opened this issue · comments

Dear developer,
When I run peaks2utr, I get an error that I cannot solve. Here is my log file:
2023-06-25 12:36:33,786 - INFO - Make .log directory
2023-06-25 12:36:33,787 - INFO - Make .cache directory
2023-06-25 12:36:33,787 - INFO - Splitting forward strand from out.bam.
2023-06-25 12:37:21,904 - INFO - Finished splitting forward strand.
2023-06-25 12:36:33,786 - INFO - Make .log directory
2023-06-25 12:36:33,787 - INFO - Make .cache directory
2023-06-25 12:36:33,787 - INFO - Splitting forward strand from out.bam.
2023-06-25 12:37:21,904 - INFO - Finished splitting forward strand.
2023-06-25 12:37:21,905 - INFO - Splitting reverse strand from out.bam.
2023-06-25 12:38:09,823 - INFO - Finished splitting reverse strand.
2023-06-25 12:38:09,828 - INFO - Merging SPAT outputs.
2023-06-25 12:38:09,828 - INFO - Filtering intervals with zero coverage.
[E::idx_find_and_load] Could not retrieve index file for '/home/jianyang/workspace/genome/L.dispar/annotation/07_utr/.cache/out.forward.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jianyang/workspace/genome/L.dispar/annotation/07_utr/.cache/out.reverse.bam'
2023-06-25 12:47:39,324 - INFO - Creating gff db.
2023-06-25 12:47:39,325 - INFO - Calling peaks for forward strand with MACS3.
2023-06-25 12:47:39,333 - INFO - Calling peaks for reverse strand with MACS3.
2023-06-25 12:36:33,786 - INFO - Make .log directory
2023-06-25 12:36:33,787 - INFO - Make .cache directory
2023-06-25 12:36:33,787 - INFO - Splitting forward strand from out.bam.
2023-06-25 12:37:21,904 - INFO - Finished splitting forward strand.
2023-06-25 12:37:21,905 - INFO - Splitting reverse strand from out.bam.
2023-06-25 12:38:09,823 - INFO - Finished splitting reverse strand.
2023-06-25 12:38:09,828 - INFO - Merging SPAT outputs.
2023-06-25 12:38:09,828 - INFO - Filtering intervals with zero coverage.
[E::idx_find_and_load] Could not retrieve index file for '/home/jianyang/workspace/genome/L.dispar/annotation/07_utr/.cache/out.forward.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jianyang/workspace/genome/L.dispar/annotation/07_utr/.cache/out.reverse.bam'
2023-06-25 12:47:39,324 - INFO - Creating gff db.
2023-06-25 12:47:39,325 - INFO - Calling peaks for forward strand with MACS3.
2023-06-25 12:47:39,333 - INFO - Calling peaks for reverse strand with MACS3.
2023-06-25 12:47:39,333 - INFO - Populating features
2023-06-25 12:47:39,333 - INFO - Populating features
Populating features table and first-order relations: 0 features^M2023-06-25 12:47:39,339 - INFO - Clearing cache.
Traceback (most recent call last):
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/gffutils/create.py", line 589, in _populate_from_lines
self._insert(f, c)
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/gffutils/create.py", line 530, in _insert
cursor.execute(constants._INSERT, feature.astuple())
sqlite3.IntegrityError: UNIQUE constraint failed: features.id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/jianyang/mambaforge-pypy3/envs/utr/bin/peaks2utr", line 8, in <module>
sys.exit(main())
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/peaks2utr/__init__.py", line 78, in main
asyncio.run(_main(args))
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/asyncio/base_events.py", line 608, in run_until_complete
return future.result()
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/peaks2utr/__init__.py", line 158, in _main
db, _, _ = await asyncio.gather(
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/peaks2utr/preprocess.py", line 158, in create_db
await sync_to_async(gffutils.create_db)(
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/asgiref/sync.py", line 479, in __call__
ret: _R = await loop.run_in_executor(
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/asgiref/sync.py", line 538, in thread_handler
return func(*args, **kwargs)
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/gffutils/create.py", line 1292, in create_db
c.create()
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/gffutils/create.py", line 507, in create
self._populate_from_lines(self.iterator)
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/gffutils/create.py", line 591, in _populate_from_lines
fixed, final_strategy = self._do_merge(f, self.merge_strategy)
File "/home/jianyang/mambaforge-pypy3/envs/utr/lib/python3.8/site-packages/gffutils/create.py", line 226, in _do_merge
raise ValueError("Duplicate ID {0.id}".format(f))
ValueError: Duplicate ID cds.evm.model.chr2.1

How can I fix it?
Thanks,
adoptbai

This is usually an indication that there are duplicated feature IDs in the reference gff file (see discussion #18). Could you tell me where you obtained your gff input from?
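One way to check for this before running peaks2utr is to count how often each `ID` attribute occurs in the GFF file. The following is only a minimal sketch, not part of peaks2utr or gffutils: the function name `find_duplicate_ids` and the sample records are hypothetical, and it assumes a tab-separated GFF3 file with `key=value` attributes in column 9. Note that GFF3 itself allows one feature (commonly a CDS) to span several lines that share an ID, so a hit here shows what the database build may reject, not necessarily a malformed file.

```python
import io
from collections import Counter


def find_duplicate_ids(lines):
    """Return {feature_id: count} for every ID attribute seen more than once."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        for attr in fields[8].split(";"):
            key, _, value = attr.strip().partition("=")
            if key == "ID" and value:
                counts[value] += 1
    return {fid: n for fid, n in counts.items() if n > 1}


# Hypothetical records, modelled on the ID reported in the traceback.
SAMPLE_GFF = (
    "##gff-version 3\n"
    "chr2\tEVM\tgene\t100\t900\t.\t+\t.\tID=evm.TU.chr2.1\n"
    "chr2\tEVM\tmRNA\t100\t900\t.\t+\t.\tID=evm.model.chr2.1;Parent=evm.TU.chr2.1\n"
    "chr2\tEVM\tCDS\t100\t300\t.\t+\t0\tID=cds.evm.model.chr2.1;Parent=evm.model.chr2.1\n"
    "chr2\tEVM\tCDS\t500\t900\t.\t+\t0\tID=cds.evm.model.chr2.1;Parent=evm.model.chr2.1\n"
)

duplicates = find_duplicate_ids(io.StringIO(SAMPLE_GFF))
print(duplicates)  # {'cds.evm.model.chr2.1': 2}
```

For a real annotation, an open file handle can be passed in place of the `io.StringIO` sample.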

Thanks for your prompt reply!
Yes, this error was indeed caused by the GFF file, which I got from the EVM software. After I renamed the duplicated ID (cds.evm.model.chr2.1), it worked, and peaks2utr now runs successfully!
Thanks once again for your kind help with my problem.
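For anyone hitting the same error with many repeated IDs, the renaming could be scripted rather than done by hand. The sketch below is an assumption about one possible approach, not the procedure used in this thread: the function name `make_ids_unique` and the `.2`, `.3` suffix scheme are hypothetical. It leaves `Parent` attributes untouched, so it is only safe when nothing references the renamed features, and it does not check whether a suffixed ID collides with one already in the file.

```python
from collections import Counter


def make_ids_unique(lines):
    """Yield GFF3 lines, suffixing the 2nd, 3rd, ... use of an ID with .2, .3, ..."""
    seen = Counter()
    for line in lines:
        stripped = line.rstrip("\n")
        fields = stripped.split("\t")
        # Pass comments, directives and non-feature lines through unchanged.
        if stripped.startswith("#") or len(fields) != 9:
            yield stripped
            continue
        attrs = []
        for attr in fields[8].split(";"):
            key, sep, value = attr.partition("=")
            if key.strip() == "ID" and value:
                seen[value] += 1
                if seen[value] > 1:
                    # Repeated ID: append the occurrence number.
                    value = f"{value}.{seen[value]}"
            attrs.append(f"{key}{sep}{value}")
        fields[8] = ";".join(attrs)
        yield "\t".join(fields)


# Hypothetical records, modelled on the ID reported in the traceback.
rows = [
    "chr2\tEVM\tCDS\t100\t300\t.\t+\t0\tID=cds.evm.model.chr2.1;Parent=evm.model.chr2.1",
    "chr2\tEVM\tCDS\t500\t900\t.\t+\t0\tID=cds.evm.model.chr2.1;Parent=evm.model.chr2.1",
]
fixed = list(make_ids_unique(rows))
print(fixed[1].split("\t")[8])  # ID=cds.evm.model.chr2.1.2;Parent=evm.model.chr2.1
```

Working on a copy of the annotation and re-checking the result before passing it to peaks2utr is advisable, since downstream tools may rely on the original IDs.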