loosolab / TOBIAS

Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal

BINDetect "ERROR IN REGION"

liz-is opened this issue

Hi, unfortunately after solving my issue with the ATACorrect step I'm running into a problem with BINDetect for the same data!

I get "ERROR IN REGION" for the first region in the bed file. I've previously used the same bed file for other conditions without issue. Looking at the code, I think the issue might be that there's no data in this region of the genome for one of the conditions. This is very plausible as I'm running TOBIAS on scATAC-seq data with aggregated data within different clusters as the different conditions.

Can you suggest a way I could pre-filter the regions to remove ones that don't have enough data in these clusters? There are definitely more regions that give this same error: when running with multiple processes I see more instances of it. Or would it be possible for TOBIAS to handle these a different way, e.g. by skipping the region and moving on to the next one?
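For the pre-filtering idea, a rough sketch is below. It keeps only BED lines where every condition's bigwig has at least a minimum number of bases with data. The function names `region_has_signal` and `filter_bed_lines` and the `min_covered` threshold are made up for illustration and are not part of TOBIAS. The `bw` objects are assumed to behave like pyBigWig handles (`chroms()` returning chromosome sizes, `values()` returning `nan` for bases without data).

```python
import math


def region_has_signal(bw, chrom, start, end, min_covered=1):
    """Return True if at least `min_covered` bases in the region have data.

    `bw` is assumed to be a pyBigWig-like object (hypothetical duck type):
    `bw.chroms()` -> dict of chromosome sizes, `bw.values()` -> list of floats.
    """
    chrom_sizes = bw.chroms()
    if chrom not in chrom_sizes:
        return False
    # Clamp to the chromosome end so the lookup never runs out of bounds.
    end = min(end, chrom_sizes[chrom])
    if start >= end:
        return False
    values = bw.values(chrom, start, end)
    covered = sum(1 for v in values if not math.isnan(v))
    return covered >= min_covered


def filter_bed_lines(bed_lines, bigwigs, min_covered=1):
    """Keep BED lines whose region has data in every bigwig in `bigwigs`."""
    kept = []
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if all(region_has_signal(bw, chrom, start, end, min_covered) for bw in bigwigs):
            kept.append(line)
    return kept
```

With real files, the handles would come from something like `[pyBigWig.open(path) for path in paths]`, and the kept lines could be written to a new BED file for BINDetect. This only implements the filtering asked about here; it does not say anything about what actually causes the error.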

Here's the relevant part of the logs, please let me know if you need any more information!

2023-11-08 10:37:08 (68501) [INFO]      Progress done!

2023-11-08 10:37:08 (68501) [DEBUG]     Getting base64 strings per motif

2023-11-08 10:37:10 (68501) [DEBUG]     Starting logger queue for multiprocessing
2023-11-08 10:37:10 (74265) [DEBUG]     Started main logger process
2023-11-08 10:37:10 (68501) [INFO]      Scanning for motifs and matching to signals...
2023-11-08 10:37:10 (68501) [DEBUG]     Setting up writer queues
2023-11-08 10:37:10 (68501) [DEBUG]     Creating writer queue for ['AC0001DLXLHXHomeodomain_AC0001DLXLHXHomeodomain', 'AC0002EMXPAXHomeodomain_AC0002EMXPAXHomeodomain',
[snipped]
2023-11-08 10:37:10 (68501) [DEBUG]     Running with cores = 1
2023-11-08 10:37:10 (68501) [DEBUG]     Setting up scanner/bigwigs/fasta
2023-11-08 10:37:11 (68501) [DEBUG]     Scanning for motif occurrences
2023-11-08 10:37:11 (68501) [SPAM]      Processing region: ('chr1', 9995, 10469, '.', '.')
2023-11-08 10:37:11 (68501) [SPAM]      Random indices: [35, 25] for region length 474
Traceback (most recent call last):
  File "/home/research/vaquerizas/liz/grn/tobias/.tobias_venv/bin/TOBIAS", line 8, in <module>
2023-11-08 10:37:11 (68501) [ERROR]     ERROR IN REGION: chr1   9995    10469   .       0       .
    sys.exit(main())
  File "/home/research/vaquerizas/liz/grn/tobias/.tobias_venv/lib/python3.10/site-packages/tobias/TOBIAS.py", line 154, in main
    args.func(args)             
  File "/home/research/vaquerizas/liz/grn/tobias/.tobias_venv/lib/python3.10/site-packages/tobias/tools/bindetect.py", line 342, in run_bindetect
    results.append(scan_and_score(chunk, motif_list, args, args.log_q, writer_qs))
  File "/home/research/vaquerizas/liz/grn/tobias/.tobias_venv/lib/python3.10/site-packages/tobias/tools/bindetect_functions.py", line 308, in scan_and_score
    raise Exception
Exception
2023-11-08 10:37:11 (74265) [ERROR]     Multiprocessing logger lost connection to queue - probably due to an error raised from a child process.

Hi @liz-is,

Is this still an issue? Technically TOBIAS should just count any regions without signal as 0, so I am not sure what is going on there. Can you share your versions of TOBIAS and pyBigWig, e.g. with pip freeze | grep -e pyBigWig -e tobias? I will then try to debug what the issue might be. Thank you!

I don't have a solution yet, but I just realised that one of the bigwig files used as input here is suspiciously small and also can't be loaded by IGV, although the logs don't suggest any issues with its creation. So I suspect that is the issue. I'll delete that file and try to re-run. It would maybe be helpful if the logs printed the condition that was being processed to make it easier to identify if a particular file is causing a problem?
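A quick way to spot an obviously broken bigwig before running BINDetect is to check the file's magic number, which for bigWig is 0x888FFC26 in either byte order. The sketch below uses only the standard library; `looks_like_bigwig` is a made-up helper name, not something TOBIAS or pyBigWig provides. It only catches empty files or files with a wrong header, so a file that was truncated after the header would still pass.

```python
import struct

BIGWIG_MAGIC = 0x888FFC26  # bigWig signature stored in the first 4 bytes


def looks_like_bigwig(path):
    """Return True if the file starts with the bigWig magic number.

    Hypothetical sanity check: it does not validate the rest of the file.
    """
    with open(path, "rb") as handle:
        raw = handle.read(4)
    if len(raw) < 4:
        return False  # empty or nearly empty file
    # The file may have been written in either byte order.
    little = struct.unpack("<I", raw)[0]
    big = struct.unpack(">I", raw)[0]
    return BIGWIG_MAGIC in (little, big)
```

Running this over all input bigwigs and comparing file sizes between conditions should make a suspiciously small or unreadable file stand out.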

Edit: forgot to include the versions, sorry, here they are:

pyBigWig==0.3.18
tobias==0.16.0

Hi @liz-is,

Ah I see, yes, I guess the error arises from reading a corrupt bigwig file, so hopefully it is solved by regenerating the .bw. But I totally agree as well that the error is not really helpful. I will improve on that for the next version! So I will just keep this issue open for my own reference 👍
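As an illustration of what a more helpful error could look like, here is a minimal sketch that names the condition and region when a read fails. This is not how TOBIAS is implemented; `read_condition_signals` is a hypothetical helper, and `bigwigs` is assumed to be a dict mapping condition names to pyBigWig-like objects with a `values()` method.

```python
def read_condition_signals(bigwigs, chrom, start, end):
    """Read the signal for one region from every condition's bigwig.

    `bigwigs` is assumed to map condition name -> pyBigWig-like object.
    Any read failure is re-raised with the condition and region attached.
    """
    signals = {}
    for condition, bw in bigwigs.items():
        try:
            signals[condition] = bw.values(chrom, start, end)
        except Exception as err:
            raise RuntimeError(
                f"Could not read signal for condition '{condition}' "
                f"in region {chrom}:{start}-{end}: {err}"
            ) from err
    return signals
```

Chaining with `from err` keeps the original traceback while the message points at the file that needs checking.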