jerryji1993 / DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Home Page: https://doi.org/10.1093/bioinformatics/btab083

AssertionError in kmer2seq for motif search

Vejni opened this issue

I may be misunderstanding what kmer2seq in motif_utils.py is meant to do, but it currently concatenates only the first base of each k-mer, then appends the last k-mer unchanged, and this triggers the AssertionError inside the function. If anyone else runs into this, (lazily) changing the function to the following worked for me:

def kmer2seq(kmers):
    """
    Convert kmers to original sequence

    Arguments:
    kmers -- str, kmers separated by space.

    Returns:
    seq -- str, original sequence.
    """
    kmers_list = kmers.split(" ")
    # Join the k-mers end to end. This recovers the original sequence
    # only if the k-mers do not overlap.
    seq = "".join(kmers_list)
    return seq
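
For anyone comparing the two behaviours, here is a minimal sketch of the reconstruction described above (first base of each k-mer, then the last k-mer unchanged) next to the plain concatenation. It is not copied from the repository: the names seq2kmer_sliding, kmer2seq_overlapping and kmer2seq_concat are hypothetical, and the length check in kmer2seq_overlapping is an assumption about what the assertion in motif_utils.py tests. Under that assumption, the check fails exactly when the first and last space-separated tokens differ in length, for example when the input ends with a trailing space.

```python
def seq2kmer_sliding(seq, k):
    """Hypothetical helper: split seq into overlapping k-mers (stride 1)."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer2seq_overlapping(kmers):
    """Sketch of the behaviour described in the issue (not the repo code verbatim)."""
    kmers_list = kmers.split(" ")
    # First base of every k-mer except the last, then the last k-mer whole.
    bases = [kmer[0] for kmer in kmers_list[:-1]]
    bases.append(kmers_list[-1])
    seq = "".join(bases)
    # Assumed length check: n k-mers of length k should give n + k - 1 bases.
    # It fails when the first and last tokens have different lengths.
    assert len(seq) == len(kmers_list) + len(kmers_list[0]) - 1
    return seq


def kmer2seq_concat(kmers):
    """The workaround from this issue: join all k-mers end to end."""
    return "".join(kmers.split(" "))
```

With overlapping 3-mers built from "ATGCGT" ("ATG TGC GCG CGT"), kmer2seq_overlapping returns "ATGCGT", while kmer2seq_concat returns "ATGTGCGCGCGT", so the concatenation only matches the original sequence when the k-mers do not overlap (e.g. "ATG CGT"). An input such as "ATG TGC " (note the trailing space) splits into a final empty token and makes the assumed assertion fail, which may be worth checking before replacing the function.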