deeptools / pyBigWig

A python extension for quick access to bigWig and bigBed files

"Empty" values in "manually" constructed bigwig file.

blaiseli opened this issue

I use pyBigWig to compute the mean of several bigWig files. This worked with Python 3.6 and an earlier version of pyBigWig (0.3.12), but no longer works with Python 3.7 and pyBigWig 0.3.17 freshly installed from pip: I end up with empty values.

Here is the code (part of a snakemake rule, chrom_sizes is a dict giving the sizes of the chromosomes):

    run:
        with warn_context(log.warnings) as warn:
            try:
                bws = [pyBigWig.open(bw_filename) for bw_filename in input]
            except RuntimeError as e:
                warn(str(e))
                warn("Generating empty file.\n")
                # Make the file empty
                open(output.bw, "w").close()
            else:
                bw_out = pyBigWig.open(output.bw, "w")
                bw_out.addHeader(list(chrom_sizes.items()))
                for (chrom, chrom_len) in chrom_sizes.items():
                    try:
                        assert all([bw.chroms()[chrom] == chrom_len for bw in bws])
                    except KeyError as e:
                        warn(str(e))
                        warn(f"Chromosome {chrom} might be missing from one of the input files.\n")
                        for filename, bw in zip(input, bws):
                            msg = " ".join([f"{filename}:", *list(bw.chroms().keys())])
                            warn(f"{msg}:\n")
                        #raise
                        warn(f"The bigwig files without {chrom} will be skipped.\n")
                    to_use = [bw for bw in bws if chrom in bw.chroms()]
                    if to_use:
                        means = np.nanmean(np.vstack([bw.values(chrom, 0, chrom_len) for bw in to_use]), axis=0)
                    else:
                        means = np.zeros(chrom_len)
                    # bin size is 10
                    bw_out.addEntries(chrom, 0, values=np.nan_to_num(means[0::10]), span=10, step=10)
                bw_out.close()
                for bw in bws:
                    bw.close()

I don't know whether this is related: for some forgotten reason, my code passed numpy=True to the values method, but with the more recent Python and pyBigWig I had to remove it because it was no longer a valid argument.

I just tested, running the code "manually" in an interactive python interpreter using the same input bigwig files, with 0.3.12 on Python 3.6 and 0.3.17 on Python 3.7, and I confirm my code works as expected in the former case, not in the latter.

Everything seems fine up to and including np.nan_to_num(means[0::10]), which is not empty.
The problem appears when I load the resulting bigWig file and look at its values: I get an empty list, and the file is only a few hundred bytes in size instead of a few MB.
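One way to make this check explicit is to count the intervals stored per chromosome after reopening the output file. The following is only a minimal sketch: `count_intervals` is a hypothetical helper name (not part of pyBigWig), and it assumes nothing beyond the `chroms()` and `intervals()` methods of an opened file.

```python
def count_intervals(bw):
    """Return {chrom: number of stored intervals} for an opened bigWig handle.

    `bw` is assumed to expose chroms() and intervals(chrom), as a file
    opened with pyBigWig.open() does.
    """
    counts = {}
    for chrom in bw.chroms():
        # Treat a None result (no data for this chromosome) as zero intervals.
        intervals = bw.intervals(chrom) or ()
        counts[chrom] = len(intervals)
    return counts


# Usage (the path "mean.bw" is a placeholder):
#   import pyBigWig
#   bw = pyBigWig.open("mean.bw")
#   print(count_intervals(bw))
#   bw.close()
```

A file showing the symptom described above would presumably report zero intervals for every chromosome.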

Can you provide any bigWig files that I can use to reproduce this?

Regarding the numpy=True option, that will only be available if numpy was installed at the time you installed pyBigWig.
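Whether an installed pyBigWig was built with numpy support can be checked at runtime through the module-level `pyBigWig.numpy` flag. A minimal sketch, where `has_numpy_support` is a hypothetical helper name and a missing attribute is assumed to mean "no support":

```python
def has_numpy_support(module):
    """Return True if the given pyBigWig module reports numpy support."""
    # pyBigWig exposes a module-level `numpy` flag (1 when built with numpy).
    # A missing attribute is treated here as "no support" (an assumption).
    return bool(getattr(module, "numpy", 0))


if __name__ == "__main__":
    try:
        import pyBigWig
    except ImportError:
        print("pyBigWig is not installed")
    else:
        print("numpy support:", has_numpy_support(pyBigWig))
```

If this reports no support, reinstalling pyBigWig in an environment where numpy is already present should enable it, as described above.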

> Regarding the numpy=True option, that will only be available if numpy was installed at the time you installed pyBigWig.

Thanks for the tip. I rebuilt pyBigWig after installing numpy and restored the numpy=True option, and now the generated bigWig files look normal again.

Is it expected that passing numpy arrays as values when adding entries with a pyBigWig build that lacks numpy support results in "empty" bigWigs? If so, maybe a warning would be useful.

I don't use numpy arrays when making bigWig files in deepTools (no issues with empty bigWig files there), so this is likely an extremely obscure artifact of how your code worked ("extremely obscure" since I too have no idea why that's not working as expected!).
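Since plain Python values are reported to work in deepTools, a possible workaround when numpy support is unavailable is to convert the array to a list of Python floats before calling addEntries. This is only a sketch under that assumption and has not been tested against pyBigWig 0.3.17; `to_plain_floats` is a hypothetical helper name. Unlike np.nan_to_num, it replaces only NaN and leaves infinities untouched.

```python
import math


def to_plain_floats(values, fill=0.0):
    """Convert an iterable of numbers to a list of Python floats.

    NaN entries are replaced with `fill`; all other values are kept.
    """
    plain = []
    for value in values:
        number = float(value)
        plain.append(fill if math.isnan(number) else number)
    return plain


# Usage, adapting the addEntries call from the original code:
#   bw_out.addEntries(chrom, 0, values=to_plain_floats(means[0::10]),
#                     span=10, step=10)
```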