Rongjiehuang / Multi-Singer

PyTorch Implementation of Multi-Singer (ACM-MM'21)

Missing Code?

Coice opened this issue

Hello,

Generator1's forward requires two parameters (x and c), but the training step passes only the mel features (no noise or other features).

Is this correct?

    def _train_step(self, batch):
        """Train model one step."""
        # parse batch
        x = []

        x.append(batch['feats'])
        embed = batch['embed'].to(self.device)

        y = batch['audios'].to(self.device)
        x = tuple([x_.to(self.device) for x_ in x])
        y_ = self.model["generator"](*x).to(self.device)

Thank you for your time.

Hi,
Yes, I forgot to add one line when cleaning up the code, and it should be:

    def _train_step(self, batch):
        """Train model one step."""
        # parse batch
        x = []
        x.append(batch['noise'])
        x.append(batch['feats'])
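
To illustrate why the missing line matters, here is a minimal sketch in plain Python. It assumes the generator's forward takes the noise as x and the mel features as c, as discussed in this thread. `StubGenerator`, `call_generator`, and the string placeholders are hypothetical stand-ins for illustration, not the repository's classes or tensors.

```python
class StubGenerator:
    """Hypothetical stand-in for a generator whose forward takes (x, c)."""

    def forward(self, x, c):
        # x: noise input, c: conditioning features (assumed to be the mel features)
        return ("generated", x, c)

    def __call__(self, *args):
        # Pass positional arguments straight through to forward.
        return self.forward(*args)


def call_generator(generator, batch, keys):
    """Collect batch entries in order and unpack them as positional arguments."""
    inputs = tuple(batch[k] for k in keys)
    return generator(*inputs)


batch = {"noise": "noise_tensor", "feats": "mel_tensor"}
gen = StubGenerator()

# Only the mel features: forward() is missing its second argument.
try:
    call_generator(gen, batch, ["feats"])
    missing_arg_error = None
except TypeError as err:
    missing_arg_error = err

# Noise first, then features: the tuple now matches forward(x, c).
output = call_generator(gen, batch, ["noise", "feats"])
```

The order of the append calls matters here, because the list is unpacked positionally: the first entry becomes x and the second becomes c.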

commented

@SunMail-hub thanks for your response.

The eval code seems to have the correct logic, but turning on F0 features or chroma would cause an error:

        """Evaluate model one step."""
        # parse batch
        x = []

        if self.config['use_noise_input']:
            x.append(batch['noise'])
        if self.config['use_f0']:
            x.append(batch['f0_origins'])
        if self.config['use_chroma']:
            x.append(batch['chromas'])
        x.append(batch['feats'])
        y = batch['audios'].to(self.device)
        x = tuple([x_.to(self.device) for x_ in x])
        embed = batch['embed'].to(self.device)
        y_ = self.model["generator"](*x).to(self.device)

Were you concatenating the extra features (f0, chromas, etc.) with the mel features to form a single tensor for the c parameter, or was there some other modification to Generator1?

Again thanks for your time.

Hi @Coice, you can see the settings in the config file:

use_f0: false
use_chroma: false
use_noise_input: true

Therefore, we use only c and noise as model inputs, without the extra features.
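
As a rough sketch of how these flags determine the number of generator inputs, the snippet below rebuilds the input tuple in the same order as the eval code quoted earlier in the thread. `build_generator_inputs` is a hypothetical helper name and the string values are placeholders for tensors; the assumption is that a forward with only two parameters (x and c) cannot accept a third positional input.

```python
def build_generator_inputs(batch, config):
    """Assemble positional generator inputs in the order used by the eval step."""
    x = []
    if config.get("use_noise_input"):
        x.append(batch["noise"])
    if config.get("use_f0"):
        x.append(batch["f0_origins"])
    if config.get("use_chroma"):
        x.append(batch["chromas"])
    x.append(batch["feats"])  # mel features always come last
    return tuple(x)


# Placeholder batch; real batches would hold tensors.
batch = {
    "noise": "noise",
    "f0_origins": "f0",
    "chromas": "chroma",
    "feats": "feats",
}

# Settings quoted from the config file in this thread.
default_config = {"use_f0": False, "use_chroma": False, "use_noise_input": True}
default_inputs = build_generator_inputs(batch, default_config)

# Enabling F0 adds a third positional input between noise and feats.
f0_config = dict(default_config, use_f0=True)
f0_inputs = build_generator_inputs(batch, f0_config)
```

With the default settings the tuple has exactly two entries, noise and feats. With use_f0 enabled it has three, which a two-parameter forward would presumably reject.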