boolean value of tensor with more than one value is ambiguous.
biscayan opened this issue · comments
- Which program causes the problem
- Python prototype
- Versions
- Python version 3.7.7
- Operating system ubuntu 18.04
- Pytorch version 1.6.0
- Issue
Hi, I read your paper and think it is a very good algorithm, so I want to apply word beam search to my research.
However, it is not easy to integrate into a Python project.
My research is on speech recognition. My input data (speech -> spectrogram) goes into the model, which produces an output of shape [sequence length (T) x batch size (B) x number of characters (C)], e.g. (371, 32, 29).
This output is then fed into the decoder.
def WordBeamSearch(mat, beamWidth, lm, useNGrams):
    "decode matrix, use given beam width and language model"
    chars = lm.getAllChars()
    blankIdx = len(chars) # blank label is supposed to be last label in RNN output
    #mat = mat.cpu().numpy()
    print(mat.shape)
    maxT, _, _ = mat.shape # shape of RNN output: TxBxC
    genesisBeam = Beam(lm, useNGrams) # empty string
    last = BeamList() # list of beams at time-step before beginning of RNN output
    last.addBeam(genesisBeam) # start with genesis beam
    # go over all time-steps
    for t in range(maxT):
        curr = BeamList() # list of beams at current time-step
        # go over best beams
        bestBeams = last.getBestBeams(beamWidth) # get best beams
        .....
The error occurs when getting the best beams. The error message 'boolean value of tensor with more than one value is ambiguous.' is raised here:
def getBestBeams(self, num):
    "return best beams, specify the max. number of beams to be returned (beam width)"
    u = [v for (_, v) in self.beams.items()]
    lmWeight = 1
    return sorted(u, reverse=True, key=lambda x: x.getPrTotal() * (x.getPrTextual() ** lmWeight))[:num]
I converted the tensor into a NumPy array, but that raises another error:
'The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'
I have tried to find a solution and read your code for days, but I don't know what the problem is.
Please help me. If you need more details about the error, please let me know.
I look forward to your comment.
Thank you for your consideration.
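For readers hitting the same message: a likely cause, assuming the beam probabilities become one value per batch element when a TxBxC input is passed in, is that sorted() has to compare vector-valued keys. The minimal sketch below reproduces the NumPy variant of the error with made-up key values; it does not use the prototype's Beam class.

```python
import numpy as np

# Scalar keys (what a single TxC matrix would be expected to yield) sort fine.
scalar_keys = [np.float64(0.2), np.float64(0.7), np.float64(0.5)]
sorted_scalars = sorted(scalar_keys, reverse=True)

# Hypothetical vector keys, one probability per batch element.
# Comparing two of them yields a boolean array, which has no single truth value.
vector_keys = [np.array([0.2, 0.9]), np.array([0.7, 0.1])]
message = ""
try:
    sorted(vector_keys, reverse=True)
except ValueError as err:
    message = str(err)

print(sorted_scalars)
print(message)
```

Under that assumption, reducing the key with .all() or .any() silences the error but changes what is being compared, so passing a single TxC slice to the decoder is the safer route.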
Hi,
- Can you post the stack trace that Python prints when it crashes (just copy and paste the output from the terminal)? I need to know the exact line where this happens.
- I never tried it with PyTorch tensors, but it should work with NumPy arrays.
- You're using the Python prototype. Is there a reason why you don't use the C++ implementation (which can also be used in Python code)? It is much faster and also provides more features.
- Is the CTC blank character the last one of the characters?
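On the blank position: the prototype expects the blank as the last label, while torch.nn.CTCLoss uses blank=0 by default. The thread does not say which index the model here was trained with, so the sketch below is only relevant if the blank sits at index 0. The helper name move_blank_to_last is hypothetical.

```python
import numpy as np

def move_blank_to_last(mat, blank_idx=0):
    """Return a copy of mat with the blank column moved to the end of the character axis."""
    chars = np.delete(mat, blank_idx, axis=-1)   # all non-blank columns, original order
    blank = np.take(mat, [blank_idx], axis=-1)   # the blank column, axis kept
    return np.concatenate([chars, blank], axis=-1)

# Tiny T=2, B=1, C=4 example with the blank assumed to be at index 0.
mat = np.arange(2 * 1 * 4, dtype=float).reshape(2, 1, 4)
moved = move_blank_to_last(mat)
print(moved[0, 0])
```

The remaining columns must still be in the same order as the characters in chars.txt.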
@githubharald
Thank you for your comment. I solved the problem by converting the tensors into NumPy arrays and calling .all() on the array.
However, I have some questions about the output of the decoder.
When I print the output, the decoder produces just one sentence. Can I get one output per batch element?
For example, the decoder would return a list whose length is the batch size, so I can get all sentences at once.
Second, does the decoder only output sentences that are in the language model (corpus)?
I created 'chars.txt' and 'wordchars.txt' with 28 characters (space, ' and A-Z)
and created 'corpus.txt' with some sentences.
It seems that the decoder only outputs sentences that are in 'corpus.txt'.
Thanks in advance.
- The prototype only works on one batch element at a time. As I said, it is better to use the C++ implementation.
- To use the language model you need a large corpus. Since you are only using a small corpus, it is best to disable the language model and use the corpus only to create a dictionary. This is called "Words" mode and is enabled by setting
useNGrams = False
in main.py.
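Since the prototype handles one batch element at a time, a TxBxC output can be decoded by looping over the batch axis. The following is a minimal sketch of that loop; decode_batch and greedy_stand_in are hypothetical names, and the stand-in is a plain best-path decoder used only to keep the example runnable, not word beam search.

```python
import numpy as np

def decode_batch(batch_mat, decode_one):
    """Call decode_one on each TxC slice of a TxBxC array and collect the results."""
    _, batch_size, _ = batch_mat.shape
    return [decode_one(batch_mat[:, b, :]) for b in range(batch_size)]

def greedy_stand_in(mat):
    """Placeholder decoder: best-path label indices, assuming the blank is the last label."""
    blank_idx = mat.shape[1] - 1
    labels, prev = [], None
    for idx in mat.argmax(axis=1):
        # Collapse repeated labels and drop blanks.
        if idx != prev and idx != blank_idx:
            labels.append(int(idx))
        prev = idx
    return labels

rng = np.random.default_rng(0)
batch_mat = rng.random((371, 32, 29))  # same shape as in the question
results = decode_batch(batch_mat, greedy_stand_in)
print(len(results))
```

With the prototype, decode_one could be something like lambda m: WordBeamSearch(m, beamWidth, lm, useNGrams), with the function expecting a TxC matrix. A PyTorch tensor would first need converting, for example with mat.detach().cpu().numpy().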
@githubharald
OK, I understand. Thank you for your explanation.
I will try to use the C++ implementation.
If I have another question, I will open an issue again.
Thank you.