thevasudevgupta / gsoc-wav2vec2

GSoC'2021 | TensorFlow implementation of Wav2Vec2

Home Page: https://thevasudevgupta.github.io/gsoc-wav2vec2/assets/final_report


Questions about processor

ahmedlone127 opened this issue

What does this code do?

def _normalize(self, x):
        """You must call this before padding."""
        # -> (1, seqlen)
        mean = tf.reduce_mean(x, axis=-1, keepdims=True)
        var = tf.math.reduce_variance(x, axis=-1, keepdims=True)
        return tf.squeeze((x - mean) / tf.sqrt(var + 1e-5))

my other question is on what basis are numbers assigned to the vocab list by that i mean this :
[screenshot: the vocabulary-building code]

I understand the code in the picture; it basically collects all the characters from the text. My question is: when it turns those characters into a dictionary with their indices as values, does it matter which character ends up at which index, and if yes, how does the right character get to the right index? I was trying to test my version of your tokenizer and had trouble producing the right outputs with your vocab.json, so I took the one here instead, which worked fine. Also, I was using a fine-tuned model for making predictions, which was associated with this tokenizer via Hugging Face.

Hi @ahmedlone127,

Thanks for your interest in this project!!

def _normalize(self, x):
        """You must call this before padding."""
        # -> (1, seqlen)
        mean = tf.reduce_mean(x, axis=-1, keepdims=True)
        var = tf.math.reduce_variance(x, axis=-1, keepdims=True)
        return tf.squeeze((x - mean) / tf.sqrt(var + 1e-5))

Wav2Vec2 was trained on speech normalised along the time axis, so this code provides that functionality. In my repository, Wav2Vec2Processor has 2 different functionalities: one handles preprocessing of speech (when is_tokenizer=False) and the other handles post-processing of model outputs, i.e. decoding logits into strings (when is_tokenizer=True). So the above code is relevant to an instance created with is_tokenizer=False. You can refer to this notebook for a better understanding.
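As a quick illustration, here is a minimal sketch of what that normalisation does to a raw waveform (the 16 kHz dummy input and its shape are just assumptions for the example):

import tensorflow as tf

# hypothetical 1-second clip at 16 kHz; shape (seqlen,)
speech = tf.random.normal((16000,))

# same steps as _normalize: zero-mean, unit-variance along time
mean = tf.reduce_mean(speech, axis=-1, keepdims=True)
var = tf.math.reduce_variance(speech, axis=-1, keepdims=True)
normalized = tf.squeeze((speech - mean) / tf.sqrt(var + 1e-5))

print(tf.reduce_mean(normalized).numpy())           # ~0.0
print(tf.math.reduce_variance(normalized).numpy())  # ~1.0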

My other question is: on what basis are numbers assigned to the vocab list?

This vocabulary file (https://github.com/vasudevgupta7/gsoc-wav2vec2/blob/main/data/vocab.json) is used for de-tokenizing. It was taken directly from the pre-trained Wav2Vec2 model, so the character-to-index mapping does matter: it must stay exactly as the model was trained with, otherwise the logits would decode into the wrong characters.
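As a rough sketch of how that file drives de-tokenizing (the index values in the example are made up):

import json

# data/vocab.json maps characters to the indices the model was trained with
with open("data/vocab.json") as f:
    vocab = json.load(f)

# invert it so model output indices map back to characters
id_to_char = {i: c for c, i in vocab.items()}

# hypothetical argmax indices taken from the model's logits
predicted_ids = [7, 5, 12]
print("".join(id_to_char[i] for i in predicted_ids))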

Hope this helps!!

Hey, thanks for the answer! I just ran the notebook you attached, and it looks like some of the stuff needs to be updated.

I just fixed it now. Can you try running that notebook again?

Yeah, looks good! Thanks. Also, why do you specify axis=-1 and keepdims=True?

I was trying to duplicate this in Scala, and this is what I got up till now:

  def mean(xs: Seq[Double]): Double = if (xs.isEmpty) 0.0 else xs.sum / xs.size

  // mean returns a plain Double (not an Option), so no flatMap is needed
  def variance(xs: Seq[Double]): Double = {
    val m = mean(xs)
    mean(xs.map(x => math.pow(x - m, 2)))
  }


It's for the first two lines. Do they look good to you? I am anxious because I don't understand what keepdims=True and axis=-1 mean, and I am probably not adding their functionality inside this function.

I am using axis=-1 to make sure normalization happens along the time dimension. keepdims=True keeps the output as an nD array when the input is an nD array.
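For a quick illustration of both arguments (the toy tensor is made up):

import tensorflow as tf

x = tf.constant([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])  # shape (2, 3): (batch, time)

# axis=-1 reduces along the last (time) dimension
print(tf.reduce_mean(x, axis=-1))                 # shape (2,):   [2. 5.]

# keepdims=True keeps the reduced axis as size 1, so the result
# broadcasts cleanly against x in (x - mean) / tf.sqrt(var + 1e-5)
print(tf.reduce_mean(x, axis=-1, keepdims=True))  # shape (2, 1): [[2.] [5.]]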

I would encourage you to print out the outputs of these statements to understand them better. Since I am not familiar with Scala, I am not sure whether your code is correct.

Okay, thanks!

Okay, so I am pretty much done with verifying the outputs. Even though I couldn't implement axis=-1, the result looked identical, with a lot more precision. I want to ask: why do we call tf.transpose here, even though the output before and after calling it is pretty much the same?

[screenshot: the code calling tf.transpose]

Hey, sorry for the late reply. You can avoid tf.transpose if everything looks alright without it.
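For what it's worth, one possible reason the output looks unchanged: tf.transpose with the default perm is a no-op on a rank-1 tensor, so if your tensor is 1-D at that point (e.g. after tf.squeeze), transposing changes nothing. A tiny sketch:

import tensorflow as tf

x = tf.constant([1.0, 2.0, 3.0])  # rank-1 tensor
print(tf.transpose(x).numpy())    # [1. 2. 3.] -- identical to x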

Closing this issue as everything is resolved. Please create a new issue in case you want to discuss something.