rdicosmo / parmap

Parmap is a minimalistic library for exploiting multicore architectures in OCaml programs with minimal modifications.

Home Page: http://rdicosmo.github.io/parmap/

Fatal error: exception End_of_file

sfmatt opened this issue

Hi Roberto,

Thanks a lot for parmap, which I was using successfully until recently. With v1.0-rc5, array_float_parmap raises the above exception for a source array of ~90K elements, even with ncores = 1. There does not seem to be any memory issue, as around half of the computer's memory is free when the exception is raised. This is on Ubuntu 14.04 64-bit, btw.

Matt

My apologies for the false alarm. For some reason the opam-installed version of parmap was shadowed by an older version in which the problem was not yet corrected.

Also, regarding issue #18: Fatal error: exception Failure("input_value_from_block: bad object"), it's caused by one of the child processes aborting abruptly (in my case an unexpected NaN value in some complex computation causes the process to abort without warning or stack trace).
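One way to avoid the silent abort described above is to turn a NaN into an ordinary exception before it reaches the code that aborts, since Parmap reports exceptions raised in workers with an explicit message. What follows is only a minimal sketch under that assumption: fail_on_nan is a hypothetical helper, not part of Parmap, and it relies only on classify_float from the standard library.

```ocaml
(* Hypothetical helper, not part of Parmap: wrap a float-valued worker
   function so that a NaN result raises Failure instead of being
   passed on silently. *)
let fail_on_nan (f : 'a -> float) (x : 'a) : float =
  let y = f x in
  match classify_float y with
  | FP_nan -> failwith "worker produced NaN"
  | _ -> y
```

The wrapped function has the same type as the original, so it could be passed wherever the worker function was passed before, for instance as (fail_on_nan f) in a call to array_float_parmap.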

Dear Matt,
thanks for auto-fixing this :-)

Let me remark here that, up to now, parmap does not gracefully handle
abnormal termination of one of the workers, as in #18.

It would require some work to make exceptions in the workers come up
as exceptions in the main program, and we have not done this yet, but
contributions are welcome.

Alas, Roberto, I'm a decent debugger but a poor programmer, unfortunately. The
best I can do to contribute is to give you two simple programs that reproduce
the exceptions in the latest parmap version:

Fatal error: exception End_of_file:
let l = [1;2;3]
let f x = exit 0
let l' = Parmap.(parmap ~ncores:2 ~chunksize:1 f (L l))
let () = List.hd l' |> print_int

Fatal error: exception Failure("input_value_from_block: bad object"):
let l = [1;2]
let f x = exit 0
let l' = Parmap.(parmap ~ncores:2 ~chunksize:1 f (L l))
let () = List.hd l' |> print_int

As you can see, there are no exceptions involved in the workers, only an exit
call. In both examples, if we replace let f x = exit 0 with let f x =
failwith "FAILED", we get an explicit error message:

[Parmap]: error at index j=0 in (0,0), chunksize=1 of a total of 1 got exception Failure("FAILED") on core 0
[Parmap]: error at index j=0 in (1,1), chunksize=1 of a total of 1 got exception Failure("FAILED") on core 1
[Parmap]: aborting due to exception on core 0: Failure("FAILED")
IMHO parmap deals perfectly fine with exceptions in the workers as it is.
Now perhaps the same (or very similar) error messages could be used in the exit
0 scenario(s)?
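To illustrate how the exit 0 case could be told apart from a normal result on the parent side, here is a minimal sketch. It is not Parmap's implementation: run_in_child and the outcome type are hypothetical names, and the sketch assumes a single worker that sends its result back over a pipe, using only Unix and Marshal from the standard distribution. If the worker exits before writing anything, the parent's read hits End_of_file, and waitpid then gives the exit status to report.

```ocaml
(* Needs the unix library, e.g.
   ocamlfind ocamlopt -package unix -linkpkg demo.ml *)

(* Hypothetical result type: either the worker's value, or the status
   of a worker that terminated without sending one. *)
type 'a outcome =
  | Value of 'a
  | Worker_died of Unix.process_status

(* Hypothetical helper: run [f x] in a forked child and marshal the
   result back through a pipe. The result must be marshalable
   (no closures, since no Marshal flags are passed). *)
let run_in_child (f : 'a -> 'b) (x : 'a) : 'b outcome =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    (* Child: compute, send the result, and always exit here so the
       child never returns into the caller's code. *)
    Unix.close rd;
    let oc = Unix.out_channel_of_descr wr in
    (try Marshal.to_channel oc (f x) []; close_out oc
     with _ -> exit 2);
    exit 0
  | pid ->
    (* Parent: an empty pipe means the child ended before writing. *)
    Unix.close wr;
    let ic = Unix.in_channel_of_descr rd in
    let received =
      try Some (Marshal.from_channel ic) with End_of_file -> None in
    close_in ic;
    let _, status = Unix.waitpid [] pid in
    match received with
    | Some y -> Value y
    | None -> Worker_died status
```

With this sketch, a worker function that calls exit 0 yields Worker_died (Unix.WEXITED 0) rather than an uncaught End_of_file, which is the information an error message like the ones above would need.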

Thank you again for parmap!

Matt
