paulfitz / cosmicos

Sending the lambda calculus into deep space

Home Page:https://cosmicos.github.io/

Something wrong with encoder

joha2 opened this issue · comments

commented

The bit codes for the commands are not chosen properly in my clone. After some substitutions, I got the following from index.txt:

 (1)(1)
 (1)|(1)(0)
 (1)|(1)(1)(0)
 (1)|(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)|(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
 (1)(1)
 (1)|(1)(0)
 ....

which is obviously not correct, because is:int should have a different bit code than is:square.
A comparison with the message on the website, http://cosmicos.github.io/wrapped.txt, gives:

(0)(100111)
(100111)|(101010)(0)
(100111)|(101010)(1)(0)
(100111)|(101010)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(100111)|(101010)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(1)(0)
(0)(101111)
(101111)|(101010)(0)
....

Where does the encoding take place, so that I can debug further? By the way, here one can see that the unary definition is missing, since there is no (0)(101010) line :-)
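The comparison above can also be done mechanically. The following is a minimal Python sketch, not part of CosmicOS: the helper names codes_in and symbol_codes are hypothetical, and it assumes that every bracketed group longer than one bit is a command code, while single-bit groups are unary digits or markers. It uses only lines quoted in this issue.

```python
import re

def codes_in(line):
    """Return the bit strings inside each (...) group of one dump line."""
    return re.findall(r"\(([01]+)\)", line)

def symbol_codes(lines):
    """Collect all codes longer than one bit (assumed to be command codes)."""
    return {code for line in lines for code in codes_in(line) if len(code) > 1}

# A few lines from the local build and from wrapped.txt, as quoted above.
broken = ["(1)(1)", "(1)|(1)(0)", "(1)|(1)(1)(0)"]
expected = [
    "(0)(100111)",
    "(100111)|(101010)(0)",
    "(100111)|(101010)(1)(0)",
    "(0)(101111)",
]

print(symbol_codes(broken))            # set()
print(sorted(symbol_codes(expected)))  # ['100111', '101010', '101111']
```

An empty set for the local build means that no command received a distinct multi-bit code, which matches the symptom described here.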

Well, TBH I'm not entirely sure. I've been working on that for a while, see #3. My best idea would be to take a look at https://github.com/CosmicOS/cosmicos.github.io/blame/master/_includes/wrapped.txt and use that date to figure out the range in which the change took place. Just looking at it, it seems to be Aug. 24, 2014. Luckily, there haven't been that many changes to the core of the code since then. Of course, the wrapped.txt file could have been precompiled before that date and then just uploaded, in which case the blame date would be misleading. If you need help looking through the commits, I'd be willing to help.

commented

Thank you, I will try to figure it out myself first. But I'm not experienced with Haxe (in fact I only heard of it by inspecting the CosmicOS code), therefore I will surely ask you for assistance later :-)

commented

Encoding takes place in https://github.com/paulfitz/cosmicos/blob/master/transform/cosmicos/Parse.hx#L226 in the functions codify and codifyInner. In the codify function we could change the line termination symbol to be more consistent with the inner encoding. I also figured out where the bit codes are generated (in one of the if-then-else constructions), but I have no clue how to debug this function. Any ideas?

commented

By using git bisect, which is a great tool, I traced the error back to
fe8f330

Unfortunately it is a very large commit, and I don't fully understand how this project is built. Therefore I need your assistance, @aw1231, to check for the most probable location of the bit encoding error.

Good idea to use bisect. Today is a holiday here, so I'll be busy with that, but just off the top of my head, the files under the transform folder would probably be the best bet. The one in tools may be useful but IDK.
Here are my guesses (which may be wrong)

  • Evaluate.hx
  • Parse.hx
commented

Yes, I also figured out that these files may contain the problem.

Just one question: in a typical C++ or Python program I would insert a bunch of cout or print statements, respectively. I also know that in Haxe one can use the trace command for equivalent output. But after that I have no idea how to compile the program and check the console output. Do you know how to do that? How do you check the results of your modifications?
Could you please outline a short workflow?

CosmicOS is much more complex than the tutorials in the getting started section of the Haxe website.

Are you asking how to compile a Haxe program? Or how to run it? Or both?

commented

Both, in particular for CosmicOS, because the build process contains many intermediate stages. How do you perform a typical run -> debug -> compile -> run workflow for CosmicOS? And which tools do you use? The answers to these questions would be of great benefit to me. Thanks in advance!

Well, usually what I would do is get it to output something to STDOUT and then use Travis to iterate. I didn't have a working Linux box until recently, so everything I did was through Travis. I could probably come up with a better way, but yeah. Usually I would just continuously print the variable of interest. In terms of stepping through the program line by line, Haxe may well have something for that. I'll look into it.

Haxe generates JS files in the current implementation. Haxe can export files to allow you to compare the javascript back to the original haxe. See: http://haxe.org/manual/target-javascript-debugging.html

Let me know if you need something better than that.

commented

Thank you for your detailed answers! Yeah, my first thought was also "send something to STDOUT and see what it does", but somehow the trace command is not executed during the build process with cmake .. && make. I think I'll play around with it a bit more and try the IDEs your wiki link suggested. Maybe once I have found an appropriate run/debug cycle I will write a wiki entry ;-)

Ok. Good luck!

commented

I found a more or less easy way to debug the .hx code, see the new wiki page. It seems that https://github.com/paulfitz/cosmicos/blob/master/transform/cosmicos/Parse.hx#L143 is the root of all evil in the encoding procedure (vocab.getBase gets the int code). For example, the number 5 is converted there from a string into an int and afterwards encoded into the appropriate bit number. I assume that inconsistencies were introduced during the conversion from integer ids of the symbols to stringy symbols. I tried to correct this error, but I got stuck in the evaluateInContext function. This is because there is one pure int branch (which is called after changing vocab.get into vocab.getBase) and several branches where the code checks for the ids of define, assign, if, lambda and so on. I don't know which philosophy lies behind this code, i.e. whether integers are intended to be used as ids or not. Maybe we should discuss this together with @paulfitz.

When you say

I assume that during the conversion from integer ids of the symbols to stringy symbols there were inconsistencies induced.

Do you mean the "stringy symbols" are the bit numbers? I think what you should do is check whether you get any "??" as this seems to indicate no match with the vocab.

Oh and enabling this trace might be useful.

//trace("working on " + Parse.deconsify(e0));

commented

By 'stringy symbols' I mean 'texty symbols', sorry. These are the ones mentioned in commit fe8f330, which probably introduced the malfunction. The output of some traces for the command is:prime 5; is

is:prime 5;
codifyLine
found string "is:prime"; looking up vocab
found code "is:prime"
converting number 5
to 5
list after encodeSymbols
[ 'is:prime', 5 ]
return codified list
1
101
213210132233

As you can see, is:prime stays is:prime after encodeSymbols, therefore I think something is wrong with the call of vocab.get(...). This has to be vocab.getBase(...), so that is:prime is converted into an integer code, which can then be converted properly into a bit code.
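To make the last trace line easier to read, here is a minimal Python sketch. It rests on assumptions that are not confirmed in this thread: that the symbols 2 and 3 stand for opening and closing brackets, and that a trailing 2233 is the line terminator. The function name readable is hypothetical and not part of CosmicOS.

```python
TERMINATOR = "2233"  # assumed end-of-line marker, not confirmed by the source

def readable(encoded):
    """Rewrite a four-symbol line into the bracket notation used in this issue."""
    if encoded.endswith(TERMINATOR):
        encoded = encoded[:-len(TERMINATOR)]
    # Assumption: 2 opens a group and 3 closes it; 0 and 1 are plain bits.
    return encoded.replace("2", "(").replace("3", ")")

print(readable("213210132233"))  # (1)(101)
print(int("101", 2))             # 5
```

Under these assumptions the second group, 101, is the binary form of 5, while is:prime has collapsed to the single bit 1, the same code that every command receives in the broken listing at the top of this issue.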

So, the code should be changed to vocab.getBase() instead of vocab.get()? Or are you saying that vocab.get() is calling vocab.getBase() and something weird is happening there?

commented

As far as I understand, the code should change vocab.get into vocab.getBase to obtain bit codes that are derived from integers. But this introduces new difficulties, because in the function evaluateInContext, where the interpretation takes place, there is a check for the id of the command symbol. Those ids are initialized with fixed values for lambda, if, and so on. Before the mentioned commit these were numbers, but now they are strings. In this part of the code this is OK, because there is an extra if construction for integer ids. But it does not seem to fit the bit encoding branch, which relies on integer ids. I just don't get it :-)
Sorry for my bad and maybe hard to understand English!

No problem, I find it very complicated too. Could you perhaps try git revert fe8f330 (or whichever commits are necessary) and then see if everything still works/passes travis, as well as fixes functionality? If so, you might have a case to revert this commit, which we could then discuss with @paulfitz.

commented

So I should revert on my fork and check on my fork if it still works with travis? Is this correct?
Could you please give me a short introduction how to initialize travis also on my fork?

Yep. Of course do your own testing, such as whether reverting actually fixes the problem. Travis only checks for build errors. If I remember correctly, all you have to do is go here: https://travis-ci.org/ and sign up, giving it permission to build your cosmicos repo. The build instructions are already in the repository so once permission is granted, it should happen automatically, but let me know if that doesn't happen.

commented

According to git bisect the problem is solved by reverting to 1 commit before fe8f330. I will try travis and let you know the results :-)

When you use git revert make sure you are specific as to the commit as we don't want to revert every change made since then.

commented

To be honest, I have done a git revert only once. What do you mean by "make sure you are specific as to the commit"? Is it possible to revert only parts of the code?

According to http://stackoverflow.com/a/2318847 , git revert acts like a cherry-pick by only reversing the changes of that specific commit.

commented

Yeah I already tested it and saw how it works in principle. My question here is: We want to retrieve Evaluate.hx and Parse.hx from before the malfunctioning commit but all other files should stay in their current state?

Correct, because as you have stated that specific commit is the issue and thus if we undo just that commit it should, hopefully, fix the problem.

commented

OK, I'll try this. But I think that the code afterwards is broken, too :-)

why do you think so?

commented

The root of the bug is that Vocab did not change its functionality in a transparent manner, because the idea was to go from integer symbols to texty ones. Therefore retrieving only Evaluate and Parse from the earlier commit may not be sufficient. I don't know how to describe it more precisely.

Okay, that makes sense. So do you know which additional commits may need to be reverted to solve the issue?

commented

No, I don't know, because the malfunctioning commit is very "deep" and it is probably only the final commit of a series of major changes in the code. That's why I suggested discussing these things with @paulfitz as well.

I simply don't understand some of his coding constructions well enough to point exactly at the buggy code. As long as @paulfitz is not available, I think the best approach would be to trace through the code further and try to find the bug(s). Do you agree?

I apologize for my "hands off" approach. I generally don't have a lot of time to devote to this project, but I still love it and will help when I can.

commented

It's no problem at all! At the end of the day this is a 'spare time project' (if this is the correct word), and I really understand that all of us can only work sporadically on such projects due to job, family, and other projects.
Nevertheless it is also good to talk about the project and to consider new ideas together!

commented

Now I restored the encoder functionality by removing the line Parse.encodeSymbols(lst,vocab); from the function evaluateLine in Evaluate.hx, but the compilation of the message breaks due to is:int | unary 0; evaluating to false.

@aw1231 do you have an idea why the encodeSymbols function is called in evaluateLine? It seems not to do anything.

Edit: encodeSymbols appears to transform e.g. string numbers into integer values.

I think in general it is not a good idea to mix evaluation and encoding unless absolutely necessary, and I don't see the necessity here. See #15.

Hi all, just to say I'm back from vacation and hoping to catch up shortly.

commented

:D sorry for flooding your whole project page ;-)

I find it somehow appropriate for a project like this to experience long periods of silence and then sudden bursts of communication :-)

Just to confirm there's a problem here, @joha2 your investigations have been very helpful, thank you. Looks like I messed up when working on a partial v2 of the message, https://cosmicos.github.io/next.html, and merged it before the logic was fully working. Will see if I can pick up the pieces, if not will undo the relevant changes.

commented

Ah, I see. Is the message broken due to the newly introduced texty symbols? What do you think of splitting your program into two separate parts: evaluation and encoding?

@joha2 I think the current situation can be cleaned up a lot. A strict separation would require some changes. For example there's a macro for inserting the results of computation directly into the message - but that could be done a different way. The message also eventually contains self-reference, and to check that in evaluation would require access to encoding. In a previous version of cosmicos, the message was completely encoded first, and then decoded to be evaluated. The current version takes a shortcut which led to the tests missing the fact that the last step of encoding was completely broken. I do have a fix in hand, just cleaning it up.
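A toy illustration of why encoding first and then decoding catches this kind of bug, written as a minimal Python sketch. Nothing here is CosmicOS code: round_trips and both encoders are hypothetical, and the pairing of codes with symbol names is inferred from the lines quoted earlier in this issue.

```python
def round_trips(symbols, encode):
    """Return True if every symbol can be recovered from its code."""
    codes = {s: encode(s) for s in symbols}
    # Build the reverse table; colliding codes overwrite each other here.
    decode = {code: s for s, code in codes.items()}
    return all(decode[codes[s]] == s for s in symbols)

# Hypothetical vocabulary, with codes as they appear in wrapped.txt above.
VOCAB = {"is:int": "100111", "unary": "101010", "is:square": "101111"}

def working_encoder(symbol):
    return VOCAB[symbol]

def broken_encoder(symbol):
    # Mimics the symptom reported in this issue: every command becomes 1.
    return "1"

print(round_trips(VOCAB, working_encoder))  # True
print(round_trips(VOCAB, broken_encoder))   # False
```

A test that evaluates symbols directly, without passing through their codes, would pass in both cases, which is the kind of gap described above.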

commented

Thanks for clarifying! I only analyzed the first few lines of the message and thought that, until the encoding bug was removed, it would not be useful to look at the other parts. Therefore I did not see that the encoding and evaluation processes are that strongly coupled.

Could you please provide a short summary of the build process of the message in the wiki? I still don't fully understand it :-(

commented

I checked the latest version of your code. The encoder works now, so maybe we can close this issue.