Problems with [torch-rb-0.5.3]/lib/torch/utils/data/data_loader.rb
lfuszara1 opened this issue
I found some problems with data_loader.rb.
The first problem is default_convert: when the data type is Float, the library raises a NotImplementedYet error. I'm analyzing audio data, so my arrays contain floats.
The second problem is in each: inside indexes.each_slice(@batch_size) do |idx|, the yield does not work. I set batch_size to 1.
The third problem is the size method: when I defined my custom dataset (the LJ Speech Dataset), I had to add a size method to the class.
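To make the second and third points concrete, here is a minimal pure-Ruby sketch of the pattern described above: a loader that builds an index list from the dataset's size, slices it with each_slice, and yields one batch at a time. TinyLoader, each_batch, and ToyDataset are hypothetical names made up for illustration; this is not torch-rb's implementation, only an assumption about the general shape of such a loop.

```ruby
# Hypothetical stand-in for a data loader (NOT torch-rb's actual code).
class TinyLoader
  def initialize(dataset, batch_size: 1)
    @dataset = dataset
    @batch_size = batch_size
  end

  # Yields arrays of samples, @batch_size at a time.
  def each_batch
    # This is the point where the dataset must respond to `size`.
    indexes = (0...@dataset.size).to_a
    indexes.each_slice(@batch_size) do |idx|
      # The dataset must also respond to `[]` with an integer index.
      yield idx.map { |i| @dataset[i] }
    end
  end
end

# Hypothetical toy dataset holding float samples.
class ToyDataset
  def initialize(rows)
    @rows = rows
  end

  def size
    @rows.size
  end

  def [](i)
    @rows[i]
  end
end

batches = []
loader = TinyLoader.new(ToyDataset.new([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]), batch_size: 2)
loader.each_batch { |batch| batches << batch }

# A dataset without `size` fails as soon as the index list is built.
missing = begin
  TinyLoader.new(Object.new).each_batch { |batch| batch }
  nil
rescue NoMethodError => e
  e.name
end
```

In this sketch a custom dataset only needs size and []; whether torch-rb 0.5.3 expects exactly that interface is what the report above suggests, not something the sketch proves.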
The following call does not work:
@dataloader = Torch::Utils::Data::DataLoader.new(@dataset, batch_size: 1).each
for this dataset class:
require 'csv'
require 'yaml'
require './lib/ruby/tts/utils'

class PrepareDataset
  def initialize(csv_file, root_dir)
    @config = YAML.load_file('./lib/ruby/tts/config.yml')
    @utils = Utils.new
    # The metadata file is pipe-separated: one row per utterance.
    @landmarks_frame = CSV.read(csv_file, liberal_parsing: true, col_sep: "|")
    @root_dir = root_dir
  end

  # Number of samples; I had to add this for the data loader.
  def size
    @landmarks_frame.length
  end
  alias_method :length, :size
  alias_method :count, :size

  # Returns [mel, mag] for sample i and caches both next to the wav file.
  def [](i)
    wav_name = @root_dir + '/' + @landmarks_frame[i][0] + ".wav"
    mel, mag = @utils.get_spectrograms(wav_name)
    mel_dump = Marshal.dump(mel)
    mag_dump = Marshal.dump(mag)
    # wav_name[0...-4] strips the ".wav" extension.
    File.open(wav_name[0...-4] + '.pt', 'wb') { |file| file.write(mel_dump) }
    File.open(wav_name[0...-4] + '.mag', 'wb') { |file| file.write(mag_dump) }
    [mel, mag]
  end
end
unless I patch data_loader.rb. As a workaround, I run this to generate the dataset:
@dataset = PrepareDataset.new(File.join(@config['data_path'], 'metadata.csv'), File.join(@config['data_path'], 'wavs'))
@dataset.size.times { |t| @dataset[t] }
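One thing that may be worth checking: the DataLoader.new(...).each call earlier in this report passes no block. In plain Ruby, a method that reaches yield when no block was given raises LocalJumpError. Whether that is what happens inside torch-rb 0.5.3 is an assumption; the sketch below only shows the Ruby behaviour and the usual to_enum guard, using a hypothetical Batches class.

```ruby
# Hypothetical class for illustration; not part of torch-rb.
class Batches
  def initialize(items, batch_size)
    @items = items
    @batch_size = batch_size
  end

  # No guard: `yield` raises LocalJumpError when called without a block.
  def each_unguarded
    @items.each_slice(@batch_size) { |batch| yield batch }
  end

  # Guarded: returns an Enumerator when called without a block.
  def each
    return to_enum(:each) unless block_given?

    @items.each_slice(@batch_size) { |batch| yield batch }
  end
end

samples = Batches.new([1.0, 2.0, 3.0], 1)

# Calling the unguarded method without a block fails.
error_class = begin
  samples.each_unguarded
  nil
rescue LocalJumpError => e
  e.class
end

# The guarded method can be called without a block and consumed later.
enumerator = samples.each
collected = enumerator.to_a
```

If this is the cause, passing a block to each (for example, each { |batch| ... }) would avoid the error without any patch.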
I'm porting https://github.com/soobinseo/Transformer-TTS to Ruby.
Hey @lfuszara1, thanks for reporting. Can you create a minimal example script that reproduces the issues with the equivalent Python code?