boolean value of tensor with more than one value is ambiguous.

Question

biscayan opened this issue 5 years ago · comments

Hyeongju Na commented 5 years ago

Which program causes the problem

Python prototype

Versions

Python version 3.7.7
Operating system ubuntu 18.04
Pytorch version 1.6.0

Issue
Hi, I read your paper and think it is a very good algorithm, so I want to apply word beam search to my research.
However, it is not easy to integrate into my Python project.
I am doing research on speech recognition. My input data (speech -> spectrogram) is fed into the model, which produces an output of shape [sequence length (T) x batch size (B) x number of characters (C)], e.g. (371, 32, 29).
This output is then fed into the decoder.

def WordBeamSearch(mat, beamWidth, lm, useNGrams):
    "decode matrix, use given beam width and language model"
    chars = lm.getAllChars()
    blankIdx = len(chars)  # blank label is supposed to be last label in RNN output
    #mat = mat.cpu().numpy()
    print(mat.shape)
    maxT, _, _ = mat.shape  # shape of RNN output: TxBxC

    genesisBeam = Beam(lm, useNGrams)  # empty string
    last = BeamList()  # list of beams at time-step before beginning of RNN output
    last.addBeam(genesisBeam)  # start with genesis beam
    # go over all time-steps
    for t in range(maxT):
        curr = BeamList()  # list of beams at current time-step

        # go over best beams
        bestBeams = last.getBestBeams(beamWidth)  # get best beams
        .....

The error occurs when getting the best beams.
The error message 'boolean value of tensor with more than one value is ambiguous.' is raised here:

def getBestBeams(self, num):
    "return best beams, specify the max. number of beams to be returned (beam width)"
    u = [v for (_, v) in self.beams.items()]
    lmWeight = 1
    return sorted(u, reverse=True, key=lambda x: x.getPrTotal() * (x.getPrTextual() ** lmWeight))[:num]

I converted the tensor into a NumPy array, but that produces another error:
'The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'

I have tried to find a solution and have been reading your code for days, but I cannot figure out what the problem is.
Please help me. If you need more details about the error, please let me know.
I look forward to your comment.
Thank you for your consideration.
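
A note on the two error messages: both are raised when Python needs a single True/False value but receives a multi-element tensor or array. One plausible cause here, which is an assumption since no stack trace was posted, is that the beam probabilities still carry a batch axis, so the sort key in getBestBeams is an array instead of a scalar. The minimal sketch below uses made-up NumPy arrays in place of the real beam probabilities to show how sorted() fails in that situation.

```python
import numpy as np

# Scalar keys sort fine: every "<" comparison yields a single bool.
scalar_keys = [0.2, 0.7, 0.5]
sorted_scalars = sorted(scalar_keys, reverse=True)

# Hypothetical keys that still carry a batch axis of size 2.
# sorted() compares them with "<", which returns an array, and
# converting that array to one bool is what raises the error.
array_keys = [np.array([0.2, 0.9]), np.array([0.7, 0.1]), np.array([0.5, 0.5])]

try:
    sorted(array_keys, reverse=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(sorted_scalars)
print(error_message)
```

If this is indeed the cause, calling .any() or .all() on the key silences the error but changes what is being compared. Passing a single TxC slice to the decoder keeps the keys scalar.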

Harald Scheidl · Answer 1 · Mon Oct 05 2020 15:33:57 GMT+0800 (China Standard Time)

Hi,

Can you post the stack trace that Python prints when it crashes (just copy and paste the output from the terminal)? I need to know the exact line where this happens.
I never tried it with PyTorch tensors, but it should work for NumPy arrays.
You're using the Python prototype. Is there a reason why you don't use the C++ implementation (which can also be used from Python code)? It is much faster and also provides more features.
Is the CTC blank character the last one of the characters?
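
Regarding the blank position: the decoder code above sets blankIdx = len(chars), so it expects the blank to be the last class. If the model was trained with the blank at index 0 (the default of torch.nn.CTCLoss), the class axis would have to be reordered first. A minimal sketch, assuming a NumPy array whose last axis is the class axis; move_blank_to_last is a hypothetical helper and not part of the repository:

```python
import numpy as np

def move_blank_to_last(mat, blank_idx=0):
    """Reorder the class (last) axis so that the blank column comes last.

    Hypothetical helper: the remaining columns keep their relative order,
    so the character list must match that order.
    """
    num_classes = mat.shape[-1]
    order = [c for c in range(num_classes) if c != blank_idx] + [blank_idx]
    return mat[..., order]

# Tiny example: 2 time-steps, 3 classes, blank stored in column 0.
example = np.arange(6).reshape(2, 3)
reordered = move_blank_to_last(example, blank_idx=0)
print(reordered)
```

The same function works on TxC and TxBxC arrays because it only indexes the last axis.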

Hyeongju Na · Answer 2 · Tue Oct 06 2020 09:07:47 GMT+0800 (China Standard Time)

@githubharald
Thank you for your comment. I solved the problem by converting the tensors into NumPy arrays and calling .all() on the array.

However, I'm curious about the output of the decoder.

When I printed the output, the decoder produced just one sentence. Can I get an output for the whole batch?
For example, the decoder would return a list whose length equals the batch size, so that I can get all sentences at once.

Second, does the decoder only output sentences that are in the language model (corpus)?
I created 'chars.txt' and 'wordchars.txt' with 28 characters (space, ' and A-Z)
and created 'corpus.txt' with some sentences.
It seems that the decoder only outputs sentences that are in 'corpus.txt'.

Thanks in advance.

Harald Scheidl · Answer 3 · Wed Oct 07 2020 17:58:05 GMT+0800 (China Standard Time)

The prototype only works on one batch element at a time. As I said, it is better to use the C++ implementation.
To use the language model you need a large corpus. Since you are only using a small corpus, it is best to disable the language model and use the corpus only to create a dictionary. This is called "Words" mode and is enabled by setting useNGrams = False in main.py.
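
Since the prototype handles one batch element at a time, a batch can be decoded by looping over the batch axis and passing each TxC slice to the decoder. The sketch below is only an illustration: decode_batch is a hypothetical wrapper, and greedy_stub is a stand-in decoder used so that the example runs without the language model.

```python
import numpy as np

def decode_batch(mat, decode_one):
    """Decode a TxBxC output one batch element at a time.

    decode_one is any function that takes a TxC matrix and returns
    the decoded result for that single batch element.
    """
    if hasattr(mat, "detach"):  # looks like a PyTorch tensor
        mat = mat.detach().cpu().numpy()
    mat = np.asarray(mat)
    _, batch_size, _ = mat.shape
    return [decode_one(mat[:, b, :]) for b in range(batch_size)]

def greedy_stub(mat_tc):
    """Stand-in decoder: index of the best class at each time-step."""
    return mat_tc.argmax(axis=1).tolist()

# Shape from the question: T=371, B=32, C=29, filled with random values.
rng = np.random.default_rng(0)
mat = rng.random((371, 32, 29))
results = decode_batch(mat, greedy_stub)
print(len(results))
```

With the prototype, decode_one could be a small lambda that calls the word beam search function with the beam width and language model, provided that function unpacks a two-dimensional TxC shape.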

Hyeongju Na · Answer 4 · Wed Oct 07 2020 21:28:46 GMT+0800 (China Standard Time)

@githubharald
OK, I understand. Thank you for your explanation.
I will try the C++ implementation.
If I have another question, I will open a new issue.
Thank you.