davidrmiller / biosim4

Biological evolution simulator

Request Challenge Files from YouTube Video

PascalCorpsman opened this issue · comments

Dear Dave,
I ported your biosim application to FPC ( https://github.com/PascalCorpsman/biosim4_FPC_translation ) so that I can adapt your code to my own needs in my favorite programming language (and also run it on Windows and Linux).
To validate my port, I am trying to reproduce the results you show in your YouTube video.
I stored the solutions I have already found here: https://github.com/PascalCorpsman/biosim4_FPC_translation/tree/main/Challanges

Timestamp 7:57 - 14:38 "Challange_1_Right_Half.ini" was no problem and was reproducible.

Timestamp 27:16 - 33:40 "Challange_13_Left_Right_eights.ini" - I was also able to reproduce this (with only 5000 - 7000 generations).

Timestamp 35:54 - 40:52 Brain sizes

This is where I ran into trouble: with 32 or more inner neurons, the individuals start as expected, but after a few hundred generations they stop going to the outside walls and keep a little more to the middle; the survivor rate then always drops below 50% and stays there (even the 8-inner-neuron version does this, but the effect is not as strong).

So could you please share your .ini files so that I can try to reproduce your results with my simulator?

Regards
Uwe Schächterle

Hi @PascalCorpsman , that was quite a porting effort! I'm impressed you got it working so well.

Unfortunately, I no longer have a copy of the exact code or the parameters I used for the scenarios seen in the video. At that time, the program was in constant development. I recall that I often had results like you described, where certain neural net topologies resulted in poor survival rates. Sometimes a small change in one of the parameters or in the action-evaluation code would make a dramatic change in the survival rate.

At this point in time, we have a nice python-based test suite contributed by @venzen that works with the current repository code. The documentation is in the tests/ subdirectory. I often use that test harness when experimenting with changes. If the floating point library implementations are sufficiently similar, then I would think that the test named "deterministic0" should pass when run against the Pascal port. That test is single-threaded and is configured to use a deterministic "random" number sequence for testing purposes. The other tests in that suite might also pass, but they are multithreaded, and I don't know how that will affect the tests on the Pascal port.

Thanks for letting us know about this port.

OK, if I understand this correctly, the "test" changes the default biosim.ini (provided by your project) according to the test specifications, then runs the simulation and compares the results against the defined values.

In the case of [deterministic0] the results should be:
result-generations = 1000
result-survivors-min = 251
result-survivors-max = 251
result-diversity-min = 0.080
result-diversity-max = 0.081
result-genomesize = 8
result-kills = 0

My simulator's results are:
result-generations = 1000
result-survivors = 227
result-diversity = 0.0154
result-genomesize = 8
result-kills = 0

So I would say the test fails. If I also look at the resulting .avi file, it seems that the "individuals" again form strange patterns.

Until now I was not able to compile or run your code; I did all this by reading your code in a simple text editor. But I think I should be able to run, and maybe debug, your code in order to compare it in detail and see where the difference is. This gives me the new task of setting up a machine capable of doing that.

I attached the last image of the last generation for comparison :)

gen1000_simstep_300

We're getting similar results (I'm impressed how much you got working). A different random number sequence could cause slight differences, but it appears that you implemented the same random number algorithm. In "deterministic" mode, it should produce the same pseudo-random sequence. These results appear to be due to something else.

When I was debugging the C++ version, I found that the test functions in unitTestBasicTypes.cpp were extremely valuable. If any of those test cases failed, then the rest of the code execution would be unpredictable.

It took a bit, but I finally got your docker file to work. I had to disable the avi encoding, as the docker image is not capable of generating h256 videos; png is also not supported, so I changed this to .bmp. Then I was able to compile your code and verify that it passes the deterministic0 test case.

I ran
unitTestConnectNeuralNetWiringFromGenome();
unitTestGridVisitNeighborhood();
unitTestBasicTypes();

in both code bases, yours and mine. Here are the results:

unitTestConnectNeuralNetWiringFromGenome(); -> Your code gives no output; mine printed:
SENSOR 0 -> NEURON 0 at 0
SENSOR 1 -> NEURON 2 at 2.199951172
SENSOR 13 -> NEURON 0 at 3.299926758
NEURON 1 -> NEURON 2 at -3.600097656
NEURON 1 -> NEURON 1 at -2.5
NEURON 2 -> NEURON 0 at -1.400024414
NEURON 0 -> NEURON 0 at -0.3000488281
NEURON 2 -> NEURON 0 at 0.7999267578
SENSOR 0 -> ACTION 1 at 1.899902344
SENSOR 2 -> ACTION 12 at 2.099975586
NEURON 0 -> ACTION 1 at 3
NEURON 1 -> ACTION 2 at -4
=> Your test code here seems to be outdated, as it uses float weights where ints should be, so I changed this in both your version and mine.
Both now print:
SENSOR 0 -> NEURON 0 at 0
SENSOR 1 -> NEURON 2 at 2
SENSOR 13 -> NEURON 0 at 3
NEURON 1 -> NEURON 2 at 4
NEURON 1 -> NEURON 1 at 5
NEURON 2 -> NEURON 0 at 6
NEURON 0 -> NEURON 0 at 7
NEURON 2 -> NEURON 0 at 8
SENSOR 0 -> ACTION 1 at 9
SENSOR 2 -> ACTION 12 at 10
NEURON 0 -> ACTION 1 at 11
NEURON 1 -> ACTION 2 at 12
-> Pass

unitTestGridVisitNeighborhood(); -> did not pass at first.
Due to rounding errors my version "overfits" the circles.
After a few changes, the code now passes with exactly the same coordinates as yours.
-> Pass

unitTestBasicTypes();
The simulator does not use the polar type, so I skipped those tests; all the others ran without any problem.
-> Pass

But running the deterministic0 test case still fails :(

The next thing I will try: your simulation prints a genome at the end of the simulation, and my simulation is capable of reading these genomes back in.
I am curious what will happen if I use these results, but to get anything useful I first need to run your simulation until the diversity has dropped to near 0, so this will take a while *g*. I will keep you informed about the results...

OK, now I am confused.
Below you see the plot of an individual from your simulator. I added the hex value of the weight after the "->".
I am ignoring the mapping of the first 16 bits for now, but I expect the value of the weight to be found somewhere in the genome, and this is not the case. Why? Even if I search bit-reversed, I am not able to find your values.


Individual ID 3
570eb37a ce33ed60 03d3177b 960c49ca 750f0f55 6bcecf8f 443a6099 39f4436f

Osc N0 17466   -> 0x443A
N1 MvE 22286   -> 0x570E
N0 LPD -12749  -> 0xCE33
Sfd MvN 979    -> 0x03D3 -> %0000 0011 1101 0011 -> %1100 1011 1100 0000 -> 0xCBC0
N2 Mrn -27124  -> 0x960C
Lx Res 29967   -> 0x750F
LPf Res 27598  -> 0x6BCE
Osc MvY 14836  -> 0x39F4

Doing the same thing with my simulator leads to this result:

Individual ID 3
570eb37a ce33ed60 03d3177b 960c49ca 750f0f55 6bcecf8f 443a6099 39f4436f

N0 N1 -19590   -> 0xB37A
Bfd N0 -4768   -> 0xED60
Ly N0 18890    -> 0x49CA
N0 N0 3925     -> 0x0F55
N0 MRL 6011    -> 0x177B
N1 MvR -12401  -> 0xCF8F
N0 Mrn 17263   -> 0x436F

The gene 443A6099 is dropped by my simulator; at the moment I don't know why, but all the other weights can be found in the original genome if you compare the results.

So the next step will be to figure out why this is the case.

It's good news that the unit tests are giving us the same results. Looking forward to what you discover.

OK, I have a new approach.

Debugging your multithreaded docker app is too difficult for me (not to say impossible), so I searched for a way to get at debug information and came up with this :).

First I took your code and dropped everything that has to do with image processing and threads. This gives me a pure C++ project which is easy to compile with a simple makefile (using VSCode).

I ran this with deterministic0.ini and expected it to produce the same results as your original application does.
-> no pass; the number of survivors matched, but the diversity was much better than yours.

By the way, giving up is not an option!

So I started thinking about what I had dropped and what could now be causing the difference. As I used your code and only changed a few dozen lines, the "bug" could not be that big.
The solution was brought by the documentation of omp_get_thread_num: I assumed this was the number of threads, but it is not. It is a distinct number for each thread, starting with 0. And this was my difference: I had set it to 1, as I expected it to be the number of active threads rather than their index (classic off-by-one ;) ).

With that fixed, I ran my "simple port". As I am now "blind", I can only compare the epoch-log.txt files. So I asked meld to show the difference between your simulation's log file and mine, and finally they are the same (y).

Now I have a simple, single-threaded C++ application that is debuggable with VSCode. Now I have everything needed to start a detailed step-by-step debug run and hopefully find the bug / difference in my FPC code.

Since this will take a while, I will keep you informed *g*.

So I have my first finding, though not in my code as expected; it is in yours. To me it looks like a "bug", but I am not sure, so I'll show it to you ;)

Finding 1

To reproduce, set a breakpoint where I did (in genome.cpp on the line "nnet.neurons.clear();") and step through the loop using the deterministic0 test case. [Luckily this is the first time the code runs, so it really is only a matter of setting the breakpoint and seeing what happens.]

I stepped through the code (left / red is before the run through the loop). As you can see, I would expect 2 neurons to be created, but the loop iterator starts at index 0 and therefore requests a non-existing entry in the nodeMap; this causes the nodeMap to create a new entry 0 (right part of the image) with all zeros. => This results in 3 neurons being created where there should only be 2. As the number of neurons affects absolutely everything the indiv does, it is clear why our code behaves differently. -> Now the fun part begins: I will try to "replicate" this behavior in my code.

OK, this one is much harder.
Finding 2
I ran the simulation step by step, and as you can see, the C++ tanh function gives a slightly different result than the FPC version. One could think this is no problem, but unfortunately it is. This tiny difference keeps the indiv from moving down, and therefore it will not be able to reproduce :(
So the next challenge is to include a tanh variant in my code which gives exactly the same values as the C++ version in all cases.

Thanks for looking at the code in Indiv::createWiringFromGenome(). It's a complicated function -- it converts a list of genes into a list of neurons (a.k.a. nodes), then removes those with no connections, then renumbers the remaining neurons, then converts them into a list of ordered connections. I don't think the code references containers out of bounds, but there could be something else amiss there.

About the tanh() differences, it looks like our floating point libraries are the same to 6 or 7 decimal digits of accuracy. I would expect that to cause only occasional very small differences in results.

Finally I made it.
First I wrote a trace program that took around 200k data points while running the original C++ code.
Then I wrote a "tolerant" (up to 0.0015 in float values) trace comparer for my FPC version that runs the code and compares the trace log at the same control points where the trace captured them in the C++ version.
This finally brought up all the differences and enabled me to produce the final image you see below ;)

Here are some of the conclusions I want to share with you:

  • float / double does not matter; the results are more or less the same (as we both expected), only my tracer needed the higher precision
  • Your version is not capable of setting the value "graphLogUpdateCommand" via the .ini file -> this always results in a crash on Windows platforms that do not have cygwin and gnuplot installed
  • The code initializes the random number generator multiple times (when running in single-thread mode)
  • Indiv.BirthLoc is not set in your code (the code is commented out, but why?)
  • It seems that C++ code doesn't care about division-by-zero errors; FPC code does, so I needed a bit more error handling here

My bugs / things I had to fix to get it working (in case someone ever tries to do the same as I did and reads this post with more or less the same errors):

  • In FPC there are no bitwise unions; my TGene is 10 bytes in size, not 4 bytes like the C++ version -> this affects the hammingDistanceBits and hammingDistanceBytes routines if ported incorrectly (which I unfortunately did). [By the way, I would recommend implementing TGene as I did, since memory access is much faster this way.]
  • C++ truncates float values when converting them to integers; do not round them!

And here it is, the main bug I made:
In spawnNewGeneration, all the parent candidates are sorted by their fitness. As FPC does not have its own sorting method, I implemented a quicksort algorithm by hand. My mistake was that the sort result was not best-to-worst but worst-to-best -> instead of the "best fit" genomes becoming parents, the "worst fit" ones were always chosen.
=>
This only matters for challenges where the fitness is weighted, so my first tests passed, as they are not weighted.
But the really cool thing is (in my opinion) that even though I implemented the wrong sorting and therefore the worst genes survived, the results from my initial posting are not that bad (y). Nature is cool.

Unfortunately, my sorting algorithm for the parents also does not give the same results as the C++ version. The reason: if indivs 1 and 2 have the same fitness, they are of equal value for the sort, and therefore the "order" of all indivs with the same fitness is random (and always different from the C++ version). This results in different parents being chosen for the next generation, which is relevant in deterministic mode and makes the FPC version not comparable to the C++ version.


After all this was said and done, I also revisited the above-mentioned point about the mapping function.

As my code now runs smoothly, I was able to test with my fix and without it.
The results are as follows:
Both versions work.
But the version that is "wrong" in my opinion gives slightly better results (a higher survivor rate).
My conclusion:
The "wrong" version creates more neurons than the other version, and as you already figured out in your YouTube video, more neurons = better survivor rates.
So even though I don't like this "feature", it gives the creatures a better survivor rate, and as the upper limit of created neurons is always p.maxNumberNeurons, it stays within the ruleset.


As soon as I have cleaned up all my code mess, I will push my changes to my GitHub repository with the "final" results.

Thank you very much for helping me and giving me the right hints for solving this issue.

And now here is my image300 of generation 1000 of the deterministic0 test run (y)

Final_Working

Congratulations on the progress!

The topic is done; I don't want to pollute your open issue list ;)