Fault in stack call choice
skaller opened this issue · comments
The following code:
begin
  println$ "rfor loop ------";
  var v = varray[1->int] 4uz;
  rfor i in (0,1,2,3) perform push_back (v,{i});
  for j in 0..3 perform println$ #(v.j);
end
prints
rfor loop ------
0
2
2
-390753808
However, this works: rfor i in (0,1,2,3) perform println$ i;
It's hard to follow, but the recursive loop is generated by the parser, and I believe it is actually correct.
The problem, IMHO, is that the optimiser is correctly recursing and generating independent frames, but it does a stack call instead of using the heap. It's also a bit nasty: rfor only works with iterators, so it seems to always recurse and never does tail call optimisation when it should.
The problem is that {i} requires binding to a heap frame containing the value of the control variable i at the time the closure is formed. That's what rfor is for. The idea is to use recursion so that each loop iteration is a separate frame, and the binding to it uses the state of that frame. If there is no binding, it should be optimised by self-tail call optimisation to a plain for loop. The optimisation is not being done even for the simple case rfor i in (0,1,2,3) perform j+=i; because of the generator, I think.
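To illustrate why a separate frame per iteration matters, here is a minimal sketch in OCaml rather than Felix. It is not the code the parser generates; the names shared_frame and frame_per_iteration are hypothetical. The first function forms every closure over one mutable loop variable, the second recurses so that each closure binds its own i.

```ocaml
(* One shared mutable variable: every closure reads the same cell,
   so all of them see its final value once the loop has finished. *)
let shared_frame () =
  let i = ref 0 in
  let fs = ref [] in
  while !i < 4 do
    fs := (fun () -> !i) :: !fs;
    incr i
  done;
  (* fs holds the closures newest-first; rev_map restores creation order *)
  List.rev_map (fun f -> f ()) !fs

(* Recursion: each call has its own binding of i, so each closure
   keeps the value that was current when it was formed. *)
let frame_per_iteration () =
  let rec loop i acc =
    if i >= 4 then List.rev acc
    else loop (i + 1) ((fun () -> i) :: acc)
  in
  List.map (fun f -> f ()) (loop 0 [])
```

In OCaml the per-iteration bindings are kept alive by the garbage collector. In the case reported here the equivalent frames were placed on the machine stack, so the stored closures ended up referring to frames that no longer existed.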
Verified: the rfor code runs fine if stack calls are disabled for procedures.
The question is why the routine didn't detect that {i} formed a closure binding to the frame. One of the problems with procedures is that they can call procedures with side effects, including storing their argument, which is what push_back actually does. So the analysis should be conservative: if a procedure is passed an argument containing a closure over the calling frame, the frame has to be heaped, not stacked.
In fact .. the code says:
| BEXE_call _
| BEXE_call_with_trap _
  ->
  (*
  print_endline (id ^ " does nasty call");
  *)
  raise Unstackable
| BEXE_jump _
| BEXE_jump_direct _
  ->
  (*
  print_endline (id ^ " does jump");
  *)
  raise Unstackable
which is grossly over-conservative!
However this code is clearly wrong!!!
| BEXE_call (_,(BEXPR_closure (j,_),_),_)
| BEXE_call_direct (_,j,_,_)
(* this case needed for virtuals/typeclasses .. *)
| BEXE_call_prim (_,j,_,_)
  ->
  let target = let bsym = Flx_bsym_table.find bsym_table j in bsym.id in
  if not (check_stackable_proc
    syms
    bsym_table
    fn_cache
    ptr_cache
    label_info
    j
    (i::recstop))
  then begin
    print_endline (id ^ " calls unstackable proc " ^ si j ^ " " ^ target);
    raise Unstackable
  end
  else begin
    print_endline (id ^ " not judged unstackable by direct call to" ^ si j ^ " " ^ target);
  end
Why? Because even if push_back is stackable, its argument is, in our example, a closure bound to the caller frame. So it's not good enough to check that the procedure is stackable, we have to also examine its argument. This code:
(* assignments not involving pointers or functions are safe *)
| BEXE_init (sr,_,(_,t))
| BEXE_assign (sr,(_,t),_)
| BEXE_storeat (sr,(_,t),_) ->
  if
    let has_vars = has_var_children bsym_table children in
    let has_funs = has_fun_children bsym_table children in
    let returns_fun = type_has_fn fn_cache syms bsym_table children t in
    let returns_ptr = type_has_ptr ptr_cache syms bsym_table children t in
    let can_stack =
      (* this is similar to a function, except we're talking
       * about storing a value in an external variable instead
       * of about returning it.
       *)
      (*
      let p = function | true -> "true" | false -> "false" in
      print_endline ("has_vars " ^ p has_vars ^ " ret ptr " ^ p returns_ptr
        ^ " has_funs " ^ p has_funs ^ " ret fun " ^ p returns_fun );
      *)
      match has_vars, returns_ptr, has_funs, returns_fun with
      | _ , _ , true , true
      | _ , true , true , _
      | true , true , _ , _ -> false
      | _ -> true
    in
    can_stack
  then
    ()
  else
    raise Unstackable
already does this analysis for assignments. Obviously push_back is also a kind of assignment.
That fixed it! Just do the same analysis for the type of the argument passed to the procedure.
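
As a rough illustration of the idea, here is a self-contained toy model in OCaml. It is not the Felix compiler source: the ty and exe types, the simplified type_has_fn and type_has_ptr, and the stackable wrapper are all hypothetical stand-ins, and has_vars / has_funs are passed in directly instead of being computed from a symbol table. Only the four-way decision table is copied from the assignment case quoted above; the sketch applies it to the argument type of a call as well.

```ocaml
(* Toy type language; an assumption made for this sketch only. *)
type ty =
  | Int
  | Pointer of ty
  | Function of ty * ty      (* domain -> codomain *)
  | Array of ty
  | Tuple of ty list         (* Tuple [] plays the role of unit *)

let rec type_has_fn = function
  | Int -> false
  | Function _ -> true
  | Pointer t | Array t -> type_has_fn t
  | Tuple ts -> List.exists type_has_fn ts

let rec type_has_ptr = function
  | Int -> false
  | Pointer _ -> true
  | Function _ -> false      (* assumption: closures are counted as functions *)
  | Array t -> type_has_ptr t
  | Tuple ts -> List.exists type_has_ptr ts

(* Decision table from the assignment case, keyed on the stored type t. *)
let can_stack ~has_vars ~has_funs t =
  match has_vars, type_has_ptr t, has_funs, type_has_fn t with
  | _, _, true, true
  | _, true, true, _
  | true, true, _, _ -> false
  | _ -> true

exception Unstackable

(* Toy instructions: each carries only the type being stored or passed. *)
type exe =
  | Assign of ty             (* type of the assigned value *)
  | Call of ty               (* type of the argument passed *)

(* A call argument gets the same check as an assigned value. *)
let check_exe ~has_vars ~has_funs = function
  | Assign t | Call t ->
    if not (can_stack ~has_vars ~has_funs t) then raise Unstackable

let stackable ~has_vars ~has_funs exes =
  try List.iter (check_exe ~has_vars ~has_funs) exes; true
  with Unstackable -> false
```

Under this model, the argument of push_back (v,{i}) would be something like Tuple [Array (Function (Tuple [], Int)); Function (Tuple [], Int)]. The enclosing procedure has a child function (the closure) and the argument type contains a function, so the table says the frame cannot be stacked.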