[BUG] "sub" operator causing BoundsError
CharFox1 opened this issue · comments
Description:
When the "sub" operator is used together with the "plus" and "mult" operators, an out-of-bounds error occurs. I'm not sure why or how these are related, but the smallest code I have found that triggers the error is below:
import numpy as np
from pysr import pysr, best, get_hof
# Dataset
X = 2*np.random.randn(100, 5)
y = X[:, 0] - X[:, 1]
# Learn equations
equations = pysr(X, y, niterations=5,
                 binary_operators=["sub", "plus", "mult"],
                 unary_operators=[])
...# (you can use ctl-c to exit early)
print(best())
# Log of tests
# plus, sub, mult = BoundsError (consistent)
# sub, mult = No Error
# plus, sub = No Error
# plus, mult, mySub = No Error (mySub(x, y) = x - y)
# plus, mult, negative = No Error (negative(x) = 0 - x)
# plus, mult = No Error (and success from (x0 + (-1.0 * x1)))
I am running Windows 10 and using VS Code and the PowerShell terminal in the IDE.
Python Version 3.9.0
Julia Version 1.6.2
The Error:
Running on julia -O3 C:\Users\ipunc\AppData\Local\Temp\tmp0ffjfong\runfile.jl
Activating environment at `C:\Python39\lib\site-packages\Project.toml`
Updating registry at `C:\Users\ipunc\.julia\registries\General`
Updating git-repo `https://github.com/JuliaRegistries/General.git`
No Changes to `C:\Python39\Lib\site-packages\Project.toml`
No Changes to `C:\Python39\Lib\site-packages\Manifest.toml`
Activating environment on workers.
From worker 3: Activating environment at `C:\Python39\Lib\site-packages\Project.toml`
From worker 4: Activating environment at `C:\Python39\Lib\site-packages\Project.toml`
From worker 5: Activating environment at `C:\Python39\Lib\site-packages\Project.toml`
From worker 2: Activating environment at `C:\Python39\Lib\site-packages\Project.toml`
Importing installed module on workers...Finished!
Testing module on workers...Finished!
Testing entire pipeline on workers...Finished!
Started!
1.0%┣▋ ┫ 1/100 [00:02<Inf:Inf, 0.0 it/s]Head worker occupation: 4.8%
Hall of Fame:
-----------------------------------------
Complexity Loss Score Equation
1 4.492e+00 7.282e-01 x0
3 1.263e-14 1.675e+01 (x0 - x1)
ERROR: LoadError: TaskFailedException
Stacktrace:
[1] wait
@ .\task.jl:322 [inlined]
[2] fetch
@ .\task.jl:337 [inlined]
[3] _EquationSearch(::SymbolicRegression...\ProgramConstants.jl.SRDistributed, datasets::Vector{SymbolicRegression...\Dataset.jl.Dataset{Float32}}; niterations::Int64, options::Options{Tuple{typeof(-), typeof(+), typeof(*)}, Tuple{}, L2DistLoss}, numprocs::Int64, procs::Nothing, runtests::Bool)
@ SymbolicRegression C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\SymbolicRegression.jl:387
[4] EquationSearch(datasets::Vector{SymbolicRegression...\Dataset.jl.Dataset{Float32}}; niterations::Int64, options::Options{Tuple{typeof(-), typeof(+), typeof(*)}, Tuple{}, L2DistLoss}, numprocs::Int64, procs::Nothing, multithreading::Bool, runtests::Bool)
@ SymbolicRegression C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\SymbolicRegression.jl:181
[5] EquationSearch(X::Matrix{Float32}, y::Matrix{Float32}; niterations::Int64, weights::Nothing, varMap::Vector{String}, options::Options{Tuple{typeof(-), typeof(+), typeof(*)}, Tuple{}, L2DistLoss}, numprocs::Int64, procs::Nothing, multithreading::Bool, runtests::Bool)
@ SymbolicRegression C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\SymbolicRegression.jl:145
[6] #EquationSearch#24
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\SymbolicRegression.jl:157 [inlined]
[7] top-level scope
@ C:\Users\ipunc\AppData\Local\Temp\tmp0ffjfong\runfile.jl:7
nested task error: On worker 2:
BoundsError: attempt to access 5×100 Matrix{Float32} at index [-1, 1:100]
Stacktrace:
[1] throw_boundserror
@ .\abstractarray.jl:651
[2] checkbounds
@ .\abstractarray.jl:616 [inlined]
[3] _getindex
@ .\multidimensional.jl:831 [inlined]
[4] getindex
@ .\abstractarray.jl:1170 [inlined]
[5] deg0_eval
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\EvaluateEquation.jl:90
[6] evalTreeArray
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\EvaluateEquation.jl:22
[7] #EvalLoss#1
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\LossFunctions.jl:28
[8] #scoreFunc#2
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\LossFunctions.jl:47
[9] scoreFunc
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\LossFunctions.jl:47 [inlined]
[10] #nextGeneration#1
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\Mutate.jl:139
[11] regEvolCycle
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\RegularizedEvolution.jl:57
[12] #SRCycle#1
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\SingleIteration.jl:34
[13] macro expansion
@ C:\Users\ipunc\.julia\packages\SymbolicRegression\1URtS\src\SymbolicRegression.jl:476 [inlined]
[14] #55
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\macros.jl:87
[15] #103
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\process_messages.jl:274
[16] run_work_thunk
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\process_messages.jl:63
[17] run_work_thunk
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\process_messages.jl:72
[18] #96
@ .\task.jl:411
Stacktrace:
[1] #remotecall_fetch#143
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\remotecall.jl:394 [inlined]
[2] remotecall_fetch(f::Function, w::Distributed.Worker, args::Distributed.RRID)
@ Distributed C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\remotecall.jl:386
[3] #remotecall_fetch#146
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\remotecall.jl:421 [inlined]
[4] remotecall_fetch
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\remotecall.jl:421 [inlined]
[5] call_on_owner
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\remotecall.jl:494 [inlined]
[6] fetch(r::Distributed.Future)
@ Distributed C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\remotecall.jl:533
[7] (::SymbolicRegression.var"#57#89"{Vector{Vector{Distributed.Future}}, Int64, Int64})()
@ SymbolicRegression .\task.jl:411
in expression starting at C:\Users\ipunc\AppData\Local\Temp\tmp0ffjfong\runfile.jl:7
┌ Warning: Forcibly interrupting busy workers
(Continues stopping other processes below)
As the test log in the code above shows, I tried both a custom subtraction operator and a custom negative operator, and both work as expected. The order of the operators does not matter, and the error never varies (it is always on worker 2). I have also tested some other equations (where I originally found the error), and they consistently run into the same issue. I'm not sure what is causing this. I think I can work around it, but it should probably be looked into.
Thanks for your time :)
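For anyone hitting the same error, the workaround from the test log can be sketched as below. This is a minimal sketch, not a fix: it assumes custom operators are passed as Julia one-liner strings in `binary_operators`, as in the `mySub(x, y) = x - y` entry of the log. The helper names `make_dataset` and `run_search` are made up for illustration, and whether `extra_sympy_mappings` is needed for the custom name is an assumption.

```python
import numpy as np

# Workaround from the test log: drop the built-in "sub" and define
# subtraction as a custom operator (Julia syntax inside the string).
BINARY_OPERATORS = ["plus", "mult", "mySub(x, y) = x - y"]


def make_dataset(n_rows=100, n_features=5, seed=0):
    """Build the same kind of dataset as in the report: y = x0 - x1."""
    rng = np.random.default_rng(seed)
    X = 2 * rng.standard_normal((n_rows, n_features))
    y = X[:, 0] - X[:, 1]
    return X, y


def run_search(X, y):
    """Hypothetical helper: run the search with the custom operator."""
    # Imported lazily so the rest of the sketch runs without PySR/Julia.
    from pysr import pysr, best

    pysr(
        X,
        y,
        niterations=5,
        binary_operators=BINARY_OPERATORS,
        unary_operators=[],
        # Assumption: a SymPy mapping for the custom name may be needed
        # so that the resulting equations can be parsed on the Python side.
        extra_sympy_mappings={"mySub": lambda a, b: a - b},
    )
    return best()
```

Calling `run_search(*make_dataset())` requires a working PySR and Julia install; the dataset helper alone runs with only NumPy.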
Which version of PySR is this? Can you reproduce it on the latest version?
Looks like "pip install pysr" gave me version 0.6.12.post1. I switched to 0.6.0, and all the tests listed above had the same results (the error still happens).
Setting multithreading=True in version 0.6.0 causes an unrecognized keyword error (but maybe that's expected). In version 0.6.12.post1, I get a FileNotFoundError for the HOF csv.bkup file created during that run, despite it showing up in the usual place. The full error for 0.6.12.post1 is below:
Running on julia -O3 --threads 4 C:\Users\ipunc\AppData\Local\Temp\tmpu7lrlnbh\runfile.jl
Activating environment at `C:\Python39\lib\site-packages\Project.toml`
Updating registry at `C:\Users\ipunc\.julia\registries\General`
Updating git-repo `https://github.com/JuliaRegistries/General.git`
No Changes to `C:\Python39\Lib\site-packages\Project.toml`
No Changes to `C:\Python39\Lib\site-packages\Manifest.toml`
Started!
Traceback (most recent call last):
File "C:\Python39\lib\site-packages\pysr\sr.py", line 1001, in get_hof
all_outputs = [pd.read_csv(str(equation_file) + ".bkup", sep="|")]
File "C:\Python39\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Python39\lib\site-packages\pandas\io\parsers\readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\Python39\lib\site-packages\pandas\io\parsers\readers.py", line 482, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Python39\lib\site-packages\pandas\io\parsers\readers.py", line 811, in __init__
self._engine = self._make_engine(self.engine)
File "C:\Python39\lib\site-packages\pandas\io\parsers\readers.py", line 1040, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "C:\Python39\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 51, in __init__
self._open_handles(src, kwds)
File "C:\Python39\lib\site-packages\pandas\io\parsers\base_parser.py", line 222, in _open_handles
self.handles = get_handle(
File "C:\Python39\lib\site-packages\pandas\io\common.py", line 701, in get_handle
handle = open(
FileNotFoundError: [Errno 2] No such file or directory: 'hall_of_fame_2021-08-24_142150.653.csv.bkup'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\ipunc\Desktop\ATOMS Lab\PySR Experiments\subtractionError.py", line 10, in <module>
equations = pysr(X, y, niterations=5,
File "C:\Python39\lib\site-packages\pysr\sr.py", line 478, in pysr
equations = get_hof(**kwargs)
File "C:\Python39\lib\site-packages\pysr\sr.py", line 1003, in get_hof
raise RuntimeError(
RuntimeError: Couldn't find equation file! The equation search likely exited before a single iteration completed.
Sorry for not following up on this. I wanted to link this issue to a related one: MilesCranmer/SymbolicRegression.jl#43. I think I may have narrowed down the cause: a very subtle bug introduced by a backend update in 0.6.10. I will let you know if this is truly the problem.
I'm planning to revert the updated backend since it has caused several such issues. In PySR 0.6.13, the backend should be back to the stable version.
Update: just pushed.
This should be fixed in the latest versions.
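To check whether an environment already has a release at or past 0.6.13 (the version mentioned above as restoring the stable backend), something like the following could be used. It is a rough sketch using only the standard library; the helper names `installed_version` and `release_tuple` are made up here, and the comparison deliberately ignores suffixes such as `.post1`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Leading numeric components only, e.g. '0.6.12.post1' gives (0, 6, 12)."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_reverted_backend(version_string, minimum=(0, 6, 13)):
    """True if the version is at least the release named in this thread."""
    return release_tuple(version_string) >= minimum


if __name__ == "__main__":
    found = installed_version("pysr")
    if found is None:
        print("pysr is not installed")
    else:
        print(found, "ok" if has_reverted_backend(found) else "older than 0.6.13")
```

If the check reports an older version, upgrading with pip should pull in the newer release.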