AIGC-Audio / AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Can AudioGPT run on CPU?

FyhSky opened this issue

Initializing AudioGPT
Initializing Make-An-Audio to cpu
LatentDiffusion_audio: Running in eps-prediction mode
DiffusionWrapper has 160.22 M params.
making attention of type 'vanilla' with 256 in_channels
making attention of type 'vanilla' with 256 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 106, 106) = 44944 dimensions.
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 256 in_channels
making attention of type 'vanilla' with 256 in_channels
making attention of type 'vanilla' with 256 in_channels
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight']

This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
TextEncoder comes with 111.32 M params.
Traceback (most recent call last):
  File "", line 1378, in <module>
    bot = ConversationBot()
  File "", line 1057, in __init__
    self.t2a = T2A(device="cpu")
  File "", line 144, in __init__
    self.sampler = self._initialize_model('text_to_audio/Make_An_Audio/configs/text_to_audio/txt2audio_args.yaml', 'text_to_audio/Make_An_Audio/useful_ckpts/ta40multi_epoch=000085.ckpt', device=device)
  File "", line 150, in _initialize_model
    model.load_state_dict(torch.load(ckpt, map_location='cpu')["state_dict"], strict=False)
  File "/root/anaconda3/envs/audiogpt/lib/python3.8/site-packages/torch/", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/root/anaconda3/envs/audiogpt/lib/python3.8/site-packages/torch/", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
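
The traceback stops inside torch.load, while the checkpoint file is being read, so the failure does not look CPU-specific. "invalid load key, '<'" means the first byte of the file is "<", which usually points to an HTML or XML page saved in place of the checkpoint (for example an error page from an interrupted or redirected download). The sketch below is one way to check what the file starts with before loading it. The function name sniff_checkpoint and its return labels are hypothetical and not part of AudioGPT or PyTorch; the byte signatures are assumptions about common cases, not an exhaustive test.

```python
def sniff_checkpoint(path, n=64):
    """Guess what kind of file `path` is from its first bytes.

    Hypothetical helper for illustration; the labels returned here are
    made up for this sketch.
    """
    with open(path, "rb") as f:
        head = f.read(n)

    if head.startswith(b"PK\x03\x04"):
        # Zip archive: the default torch.save format in recent PyTorch.
        return "zip"
    if head.startswith(b"\x80"):
        # Pickle protocol 2+ marker: the legacy torch.save format.
        return "pickle"
    if head.startswith(b"version https://git-lfs"):
        # Git LFS pointer file checked out instead of the real weights.
        return "git-lfs-pointer"
    if head.lstrip().startswith(b"<"):
        # HTML/XML text: most likely a web page saved instead of the weights.
        return "markup"
    return "unknown"
```

Calling sniff_checkpoint('text_to_audio/Make_An_Audio/useful_ckpts/ta40multi_epoch=000085.ckpt') and getting "markup" would suggest that the checkpoint needs to be downloaded again; opening the file in a text editor should then show the page that was saved.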
