ValueError when trying to do a PCA
himaniyadav opened this issue · comments
My code is up to date on the main branch. I've added a requirements.txt
file for ease of reproducibility.
I've gotten the VAE to train. To visualize the resulting latent vectors, I've been trying to run a simple 2-component PCA on the latent space. I'm definitely not expecting the results to be particularly good, but would just like to see what it looks like at this stage. The latent vectors have 256 dimensions and are of type torch.Tensor. I'm just grabbing the latent vector for each batch after training. So into the PCA I'm passing a list of length 114 (# batches), with each item having shape [1, 256] (batch size of 1, each vector 256 dimensions). I keep getting the following error: ValueError: only one element tensors can be converted to Python scalars.
There might be a simple solution but a few online searches didn't yield any helpful results. I feel like it could be something obvious I'm missing but I couldn't exactly find what would be causing the error. Would appreciate any help if something stands out!
Command:
python src/main.py
Full stack trace:
Traceback (most recent call last):
File "main.py", line 49, in <module>
pca_embedding = PCA(n_components=2).fit_transform(big_mus)
File "/home/himani/anaconda3/lib/python3.7/site-packages/sklearn/decomposition/_pca.py", line 376, in fit_transform
U, S, V = self._fit(X)
File "/home/himani/anaconda3/lib/python3.7/site-packages/sklearn/decomposition/_pca.py", line 398, in _fit
ensure_2d=True, copy=self.copy)
File "/home/himani/anaconda3/lib/python3.7/site-packages/sklearn/base.py", line 420, in _validate_data
X = check_array(X, **check_params)
File "/home/himani/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
return f(**kwargs)
File "/home/himani/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 598, in check_array
array = np.asarray(array, order=order, dtype=dtype)
File "/home/himani/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py", line 83, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: only one element tensors can be converted to Python scalars
Try to provide some other diagnostics if you can, e.g. the observed vs. expected shape of big_mus when it is fed to the transform, and anything else you think may be relevant. My initial guess would be that big_mus is of the wrong shape.
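For instance, a quick summary like the following would show what is actually being handed to PCA. This is only a sketch: the `describe` helper is hypothetical, and the `big_mus` built here is a random stand-in for the real list from main.py.

```python
import torch

# Random stand-in for the real list built in main.py (assumption):
# 114 tensors, each of shape [1, 256].
big_mus = [torch.randn(1, 256) for _ in range(114)]

def describe(items):
    """Summarize a list of tensors: container type, length, item type and shape."""
    first = items[0]
    return {
        "container": type(items).__name__,
        "length": len(items),
        "item_type": type(first).__name__,
        "item_shape": tuple(first.shape),
    }

summary = describe(big_mus)
print(summary)
```

Pasting that output into the issue makes it easy to compare against the 2D input that PCA expects.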
"So into the PCA I'm passing in a list of size 114 (# batches) with each item being having shape [1, 256] (batch size of 1, each vector 256 dimensions)"
Is the list that you're passing in the same as big_mus? I'm going to assume yes for the rest of this comment. Then the shape of big_mus is effectively [114, 1, 256], correct? Per sklearn's API, the data passed to fit_transform should be 2D, with shape (n_samples, n_features), e.g. [114, 256]. Also make sure to convert from torch tensors to numpy arrays (you can use the tensor .numpy() method, amongst other options).
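A minimal sketch of both changes, assuming big_mus is a list of 114 tensors of shape [1, 256]. The random tensors below are a stand-in for the real latent vectors, and the name `latents` is just for illustration. The `.detach().cpu()` calls are there in case the tensors still require gradients or live on the GPU; on plain CPU tensors they are harmless.

```python
import torch
from sklearn.decomposition import PCA

# Random stand-in for the real latent vectors (assumption):
# 114 tensors, each of shape [1, 256].
torch.manual_seed(0)
big_mus = [torch.randn(1, 256) for _ in range(114)]

# Concatenate along the batch dimension to get a single [114, 256] tensor,
# then detach from the autograd graph, move to CPU, and convert to NumPy.
latents = torch.cat(big_mus, dim=0).detach().cpu().numpy()
print(latents.shape)  # (114, 256)

# PCA now receives a 2D array of shape (n_samples, n_features).
pca_embedding = PCA(n_components=2).fit_transform(latents)
print(pca_embedding.shape)  # (114, 2)
```

Using torch.cat on the existing batch dimension avoids the extra singleton axis that stacking the [1, 256] tensors would produce.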
Implement both of those changes and let me know if the problem persists. If it does, try to provide more contextual debugging info if possible.
Passed in a new numpy array of shape (114, 256) and it works now, thank you!