GAMES-UChile / mogptk

Multi-Output Gaussian Process Toolkit


airline passengers and Mauna Loa examples (at least) not running

rowlesmr opened this issue · comments

Hi all

I've just installed MOGPTK and am unable to run (at least) the first two examples (airline passengers and Mauna Loa).

I'm currently running this in a Jupyter notebook, and the following commands:

# Mauna Loa
method = 'BNSE'
model.init_parameters(method)
model.plot_spectrum(title=f'PSD with {method} initialization');

give me the following stack trace:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[5], line 2
      1 method = 'BNSE'
----> 2 model.init_parameters(method)
      3 model.plot_spectrum(title=f'PSD with {method} initialization');

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\models\sm.py:103, in SM.init_parameters(self, method, iters)
    101         return
    102 elif method.lower() == 'bnse':
--> 103     amplitudes, means, variances = self.dataset.get_bnse_estimation(self.Q, iters=iters)
    104     if np.sum(amplitudes) == 0.0:
    105         logger.warning('BNSE could not find peaks for SM')

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\dataset.py:605, in DataSet.get_bnse_estimation(self, Q, n, iters)
    603 variances = []
    604 for channel in self.channels:
--> 605     channel_amplitudes, channel_means, channel_variances = channel.get_bnse_estimation(Q, n, iters=iters)
    606     amplitudes.append(channel_amplitudes)
    607     means.append(channel_means)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\data.py:997, in Data.get_bnse_estimation(self, Q, n, iters)
    995     y_err = (y_err_upper-y_err_lower)/2.0 # TODO: strictly incorrect: takes average error after transformation
    996 for i in range(input_dims):
--> 997     w, psd, _ = BNSE(x[:,i], y, y_err=y_err, max_freq=nyquist[i], n=n, iters=iters)
    998     # TODO: why? emperically found
    999     psd /= (np.max(x[:,i])-np.min(x[:,i]))**2

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\init.py:53, in BNSE(x, y, y_err, max_freq, n, iters)
     51 optimizer = torch.optim.Adam(model.parameters(), lr=2.0)
     52 for i in range(iters):
---> 53     optimizer.step(model.loss)
     55 alpha = float(0.5/x_range**2)
     56 w = torch.linspace(0.0, max_freq, n, device=gpr.config.device, dtype=gpr.config.dtype).reshape(-1,1)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\optim\optimizer.py:280, in Optimizer.profile_hook_step.<locals>.wrapper(*args, **kwargs)
    276         else:
    277             raise RuntimeError(f"{func} must return None or a tuple of (new_args, new_kwargs),"
    278                                f"but got {result}.")
--> 280 out = func(*args, **kwargs)
    281 self._optimizer_step_code()
    283 # call optimizer step post hooks

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\optim\optimizer.py:33, in _use_grad_for_differentiable.<locals>._use_grad(self, *args, **kwargs)
     31 try:
     32     torch.set_grad_enabled(self.defaults['differentiable'])
---> 33     ret = func(self, *args, **kwargs)
     34 finally:
     35     torch.set_grad_enabled(prev_grad)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\optim\adam.py:121, in Adam.step(self, closure)
    119 if closure is not None:
    120     with torch.enable_grad():
--> 121         loss = closure()
    123 for group in self.param_groups:
    124     params_with_grad = []

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\mogptk\gpr\model.py:311, in Model.loss(self)
    309 self.zero_grad()
    310 loss = -self.log_marginal_likelihood() - self.log_prior()
--> 311 loss.backward()
    312 return loss

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\_tensor.py:487, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    477 if has_torch_function_unary(self):
    478     return handle_torch_function(
    479         Tensor.backward,
    480         (self,),
   (...)
    485         inputs=inputs,
    486     )
--> 487 torch.autograd.backward(
    488     self, gradient, retain_graph, create_graph, inputs=inputs
    489 )

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\autograd\__init__.py:200, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    195     retain_graph = create_graph
    197 # The reason we repeat same the comment below is that
    198 # some Python versions print out the first line of a multi-line function
    199 # calls in the traceback and some print out the last line
--> 200 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    201     tensors, grad_tensors_, retain_graph, create_graph, inputs,
    202     allow_unreachable=True, accumulate_grad=True)

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.DoubleTensor [200]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Any ideas on what's going on?
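For reference, the hint at the end of the error message can be followed directly. Below is a minimal sketch (not mogptk code, just an illustration of the same class of failure): `torch.exp` saves its output tensor for the backward pass, so editing that output in place bumps its version counter and trips the same check, and `torch.autograd.set_detect_anomaly(True)` points back at the forward op that produced the modified tensor.

```python
import torch

torch.autograd.set_detect_anomaly(True)  # per the hint in the error message

x = torch.ones(3, requires_grad=True)
y = torch.exp(x)   # exp saves its output for use in the backward pass
y.add_(1.0)        # in-place edit bumps y's version counter

try:
    y.sum().backward()
except RuntimeError as e:
    # "...modified by an inplace operation... is at version 1; expected version 0"
    print(type(e).__name__, e)
```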

python --version
Python 3.11.2

pip freeze:
aiofiles==22.1.0
aiosqlite==0.18.0
anyio==3.6.2
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.2.1
attrs==22.2.0
Babel==2.12.1
backcall==0.2.0
beautifulsoup4==4.11.2
bleach==6.0.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
colorama==0.4.6
comm==0.1.2
contourpy==1.0.7
cycler==0.11.0
debugpy==1.6.6
decorator==5.1.1
defusedxml==0.7.1
executing==1.2.0
fastjsonschema==2.16.3
filelock==3.10.0
fonttools==4.38.0
fqdn==1.5.1
idna==3.4
ipykernel==6.21.3
ipython==8.11.0
ipython-genutils==0.2.0
isoduration==20.11.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
json5==0.9.11
jsonpointer==2.3
jsonschema==4.17.3
jupyter-events==0.6.3
jupyter-ydoc==0.2.3
jupyter_client==8.0.3
jupyter_core==5.3.0
jupyter_server==2.5.0
jupyter_server_fileid==0.8.0
jupyter_server_terminals==0.4.4
jupyter_server_ydoc==0.6.1
jupyterlab==3.6.1
jupyterlab-pygments==0.2.2
jupyterlab_server==2.20.0
kiwisolver==1.4.4
MarkupSafe==2.1.2
matplotlib==3.7.0
matplotlib-inline==0.1.6
mistune==2.0.5
mogptk==0.3.2
mplcursors==0.5.2
mpmath==1.3.0
nbclassic==0.5.3
nbclient==0.7.2
nbconvert==7.2.10
nbformat==5.7.3
nest-asyncio==1.5.6
networkx==3.0
notebook==6.5.3
notebook_shim==0.2.2
numpy==1.24.2
packaging==23.0
pandas==1.5.3
pandocfilters==1.5.0
parso==0.8.3
pdCIFplotter==0.1.3
pickleshare==0.7.5
Pillow==9.4.0
platformdirs==3.1.1
prometheus-client==0.16.0
prompt-toolkit==3.0.38
protobuf==4.21.12
psutil==5.9.4
pure-eval==0.2.2
PyCifRW==4.4.3
pycparser==2.21
Pygments==2.14.0
pyparsing==3.0.9
pyrsistent==0.19.3
PySimpleGUI==4.60.4
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2022.7.1
pywin32==305
pywinpty==2.0.10
PyYAML==6.0
pyzmq==25.0.1
requests==2.28.2
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
Rx==3.2.0
scikit-learn==1.2.2
scipy==1.10.1
Send2Trash==1.8.0
six==1.16.0
sklearn==0.0.post1
sniffio==1.3.0
soupsieve==2.4
stack-data==0.6.2
sympy==1.11.1
terminado==0.17.1
threadpoolctl==3.1.0
tinycss2==1.2.1
torch==2.0.0
tornado==6.2
traitlets==5.9.0
typing_extensions==4.5.0
uri-template==1.2.0
urllib3==1.26.15
wcwidth==0.2.6
webcolors==1.12
webencodings==0.5.1
websocket-client==1.5.1
y-py==0.5.9
ypy-websocket==0.8.2
zaber-motion==3.1.1
zaber-motion-bindings-windows==3.1.1

Thanks for raising this issue! The code works fine for torch at v1, but fails for me with the same error message for torch at v2. Working on a fix, I'll keep you posted.
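For context, the failing line uses the closure form of `Adam.step` (visible in the traceback as `optimizer.step(model.loss)`), which re-evaluates the loss and runs `backward()` inside each step. A toy sketch of that pattern with a stand-in quadratic objective (nothing here is mogptk's actual model) runs cleanly on torch v2, which is consistent with the problem sitting inside the model's loss graph rather than in the optimizer loop itself:

```python
import torch

# Stand-in parameter and toy objective; mogptk's real loss is
# -log_marginal_likelihood() - log_prior() (see model.py in the traceback).
param = torch.nn.Parameter(torch.tensor([0.0]))
optimizer = torch.optim.Adam([param], lr=0.1)

def closure():
    optimizer.zero_grad()
    loss = (param - 3.0) ** 2
    loss.backward()
    return loss

for _ in range(200):
    optimizer.step(closure)  # same closure pattern as optimizer.step(model.loss)

print(float(param))  # approaches 3.0
```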

I seem to have found the bug, can you please check?

I was able to copy and paste the example into a Jupyter notebook, and it all ran and looked like the documented output. So, yes, that seems to fix it.

I made a fresh virtual env, checked out the latest master, activated the venv, installed mogptk, and went from there.

I also just noticed that the EEG example in the docs fails with a non-positive-definite matrix error. I'll open a separate issue for that.