GAMES-UChile / mogptk

Multi-Output Gaussian Process Toolkit


airline passengers and Mauna Loa examples (at least) not running

rowlesmr opened this issue · comments

Hi all

I've just installed MOGPTK and am unable to run (at least) the first two examples (airline passengers and Mauna Loa).

I'm currently running this in a Jupyter notebook, and the following commands:

# Mauna Loa
method = 'BNSE'
model.init_parameters(method)
model.plot_spectrum(title=f'PSD with {method} initialization');

give me the following stack trace:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[5], line 2
      1 method = 'BNSE'
----> 2 model.init_parameters(method)
      3 model.plot_spectrum(title=f'PSD with {method} initialization');

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\models\sm.py:103, in SM.init_parameters(self, method, iters)
    101         return
    102 elif method.lower() == 'bnse':
--> 103     amplitudes, means, variances = self.dataset.get_bnse_estimation(self.Q, iters=iters)
    104     if np.sum(amplitudes) == 0.0:
    105         logger.warning('BNSE could not find peaks for SM')

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\dataset.py:605, in DataSet.get_bnse_estimation(self, Q, n, iters)
    603 variances = []
    604 for channel in self.channels:
--> 605     channel_amplitudes, channel_means, channel_variances = channel.get_bnse_estimation(Q, n, iters=iters)
    606     amplitudes.append(channel_amplitudes)
    607     means.append(channel_means)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\data.py:997, in Data.get_bnse_estimation(self, Q, n, iters)
    995     y_err = (y_err_upper-y_err_lower)/2.0 # TODO: strictly incorrect: takes average error after transformation
    996 for i in range(input_dims):
--> 997     w, psd, _ = BNSE(x[:,i], y, y_err=y_err, max_freq=nyquist[i], n=n, iters=iters)
    998     # TODO: why? emperically found
    999     psd /= (np.max(x[:,i])-np.min(x[:,i]))**2

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\init.py:53, in BNSE(x, y, y_err, max_freq, n, iters)
     51 optimizer = torch.optim.Adam(model.parameters(), lr=2.0)
     52 for i in range(iters):
---> 53     optimizer.step(model.loss)
     55 alpha = float(0.5/x_range**2)
     56 w = torch.linspace(0.0, max_freq, n, device=gpr.config.device, dtype=gpr.config.dtype).reshape(-1,1)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\optim\optimizer.py:280, in Optimizer.profile_hook_step.<locals>.wrapper(*args, **kwargs)
    276         else:
    277             raise RuntimeError(f"{func} must return None or a tuple of (new_args, new_kwargs),"
    278                                f"but got {result}.")
--> 280 out = func(*args, **kwargs)
    281 self._optimizer_step_code()
    283 # call optimizer step post hooks

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\optim\optimizer.py:33, in _use_grad_for_differentiable.<locals>._use_grad(self, *args, **kwargs)
     31 try:
     32     torch.set_grad_enabled(self.defaults['differentiable'])
---> 33     ret = func(self, *args, **kwargs)
     34 finally:
     35     torch.set_grad_enabled(prev_grad)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\optim\adam.py:121, in Adam.step(self, closure)
    119 if closure is not None:
    120     with torch.enable_grad():
--> 121         loss = closure()
    123 for group in self.param_groups:
    124     params_with_grad = []

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\gpr\model.py:311, in Model.loss(self)
    309 self.zero_grad()
    310 loss = -self.log_marginal_likelihood() - self.log_prior()
--> 311 loss.backward()
    312 return loss

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\_tensor.py:487, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    477 if has_torch_function_unary(self):
    478     return handle_torch_function(
    479         Tensor.backward,
    480         (self,),
   (...)
    485         inputs=inputs,
    486     )
--> 487 torch.autograd.backward(
    488     self, gradient, retain_graph, create_graph, inputs=inputs
    489 )

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\autograd\__init__.py:200, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    195     retain_graph = create_graph
    197 # The reason we repeat same the comment below is that
    198 # some Python versions print out the first line of a multi-line function
    199 # calls in the traceback and some print out the last line
--> 200 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    201     tensors, grad_tensors_, retain_graph, create_graph, inputs,
    202     allow_unreachable=True, accumulate_grad=True)

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.DoubleTensor [200]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Any ideas on what's going on?
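For reference, the hint at the end of the error message can be followed directly. Below is a minimal sketch (not mogptk code, just an illustration of the same class of failure): `torch.exp` saves its output tensor for the backward pass, so editing that output in place bumps its version counter and trips the same check, and `torch.autograd.set_detect_anomaly(True)` points back at the forward op that produced the modified tensor.

```python
import torch

torch.autograd.set_detect_anomaly(True)  # per the hint in the error message

x = torch.ones(3, requires_grad=True)
y = torch.exp(x)   # exp saves its output for use in the backward pass
y.add_(1.0)        # in-place edit bumps y's version counter

try:
    y.sum().backward()
except RuntimeError as e:
    # "...modified by an inplace operation... is at version 1; expected version 0"
    print(type(e).__name__, e)
```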

python --version
Python 3.11.2

pip freeze:
aiofiles==22.1.0
aiosqlite==0.18.0
anyio==3.6.2
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.2.1
attrs==22.2.0
Babel==2.12.1
backcall==0.2.0
beautifulsoup4==4.11.2
bleach==6.0.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
colorama==0.4.6
comm==0.1.2
contourpy==1.0.7
cycler==0.11.0
debugpy==1.6.6
decorator==5.1.1
defusedxml==0.7.1
executing==1.2.0
fastjsonschema==2.16.3
filelock==3.10.0
fonttools==4.38.0
fqdn==1.5.1
idna==3.4
ipykernel==6.21.3
ipython==8.11.0
ipython-genutils==0.2.0
isoduration==20.11.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
json5==0.9.11
jsonpointer==2.3
jsonschema==4.17.3
jupyter-events==0.6.3
jupyter-ydoc==0.2.3
jupyter_client==8.0.3
jupyter_core==5.3.0
jupyter_server==2.5.0
jupyter_server_fileid==0.8.0
jupyter_server_terminals==0.4.4
jupyter_server_ydoc==0.6.1
jupyterlab==3.6.1
jupyterlab-pygments==0.2.2
jupyterlab_server==2.20.0
kiwisolver==1.4.4
MarkupSafe==2.1.2
matplotlib==3.7.0
matplotlib-inline==0.1.6
mistune==2.0.5
mogptk==0.3.2
mplcursors==0.5.2
mpmath==1.3.0
nbclassic==0.5.3
nbclient==0.7.2
nbconvert==7.2.10
nbformat==5.7.3
nest-asyncio==1.5.6
networkx==3.0
notebook==6.5.3
notebook_shim==0.2.2
numpy==1.24.2
packaging==23.0
pandas==1.5.3
pandocfilters==1.5.0
parso==0.8.3
pdCIFplotter==0.1.3
pickleshare==0.7.5
Pillow==9.4.0
platformdirs==3.1.1
prometheus-client==0.16.0
prompt-toolkit==3.0.38
protobuf==4.21.12
psutil==5.9.4
pure-eval==0.2.2
PyCifRW==4.4.3
pycparser==2.21
Pygments==2.14.0
pyparsing==3.0.9
pyrsistent==0.19.3
PySimpleGUI==4.60.4
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2022.7.1
pywin32==305
pywinpty==2.0.10
PyYAML==6.0
pyzmq==25.0.1
requests==2.28.2
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
Rx==3.2.0
scikit-learn==1.2.2
scipy==1.10.1
Send2Trash==1.8.0
six==1.16.0
sklearn==0.0.post1
sniffio==1.3.0
soupsieve==2.4
stack-data==0.6.2
sympy==1.11.1
terminado==0.17.1
threadpoolctl==3.1.0
tinycss2==1.2.1
torch==2.0.0
tornado==6.2
traitlets==5.9.0
typing_extensions==4.5.0
uri-template==1.2.0
urllib3==1.26.15
wcwidth==0.2.6
webcolors==1.12
webencodings==0.5.1
websocket-client==1.5.1
y-py==0.5.9
ypy-websocket==0.8.2
zaber-motion==3.1.1
zaber-motion-bindings-windows==3.1.1

Thanks for raising this issue! The code works fine for torch at v1, but fails for me with the same error message for torch at v2. Working on a fix, I'll keep you posted.
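For context, the failing line uses the closure form of `Adam.step` (visible in the traceback as `optimizer.step(model.loss)`), which re-evaluates the loss and runs `backward()` inside each step. A toy sketch of that pattern with a stand-in quadratic objective (nothing here is mogptk's actual model) runs cleanly on torch v2, which is consistent with the problem sitting inside the model's loss graph rather than in the optimizer loop itself:

```python
import torch

# Stand-in parameter and toy objective; mogptk's real loss is
# -log_marginal_likelihood() - log_prior() (see model.py in the traceback).
param = torch.nn.Parameter(torch.tensor([0.0]))
optimizer = torch.optim.Adam([param], lr=0.1)

def closure():
    optimizer.zero_grad()
    loss = (param - 3.0) ** 2
    loss.backward()
    return loss

for _ in range(200):
    optimizer.step(closure)  # same closure pattern as optimizer.step(model.loss)

print(float(param))  # approaches 3.0
```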

I seem to have found the bug, can you please check?

I was able to copy and paste the example into a Jupyter notebook, and it all ran and looked like the documented output. So, yes, that seems to fix it.

I made a fresh virtual env, checked out the latest master, activated the venv, installed mogptk, and went from there.

I also just noticed that the EEG example in the docs fails with a non-positive-definite matrix error. I'll open a separate issue for that.