rdicosmo / parmap

Parmap is a minimalistic library that lets OCaml programs exploit multicore architectures with minimal modifications.

Home Page: http://rdicosmo.github.io/parmap/


process initialization and finalization

UnixJunkie opened this issue

For complex use cases, it would be very handy to be able to register
an init function and a finalize function that would be run by each worker process:

  • the init function is called exactly once by each child process, just after the
    process is created
  • the finalize function is called exactly once by each child process, just before
    it exits

This makes it possible, for example, to set up and clean up per-process output files
for the workers of Array.iteri or List.iteri.
Maybe those functions should be called process_setup and process_cleanup,
or some better name.
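To make the requested semantics concrete, here is a minimal sketch. It is not Parmap code and does not fork: `run_workers` is a hypothetical helper that simulates `ncores` workers one after another, only to show that the init hook runs once per worker before any work and the finalize hook once after it. Passing the hooks as optional arguments is an assumption about how such an API could look.

```ocaml
(* Hypothetical sketch: simulate [ncores] workers sequentially (no fork).
   Each worker runs [init] once, processes its share of the array,
   then runs [finalize] once. *)
let run_workers ?(init = fun () -> ()) ?(finalize = fun () -> ())
    ~ncores (f : int -> 'a -> unit) (a : 'a array) : unit =
  if ncores <= 0 then invalid_arg "run_workers: ncores must be positive";
  for w = 0 to ncores - 1 do
    init ();
    (* Worker [w] handles indices w, w + ncores, w + 2 * ncores, ... *)
    let i = ref w in
    while !i < Array.length a do
      f !i a.(!i);
      i := !i + ncores
    done;
    finalize ()
  done

(* Counters used to observe how often each hook runs. *)
let inits = ref 0
let finals = ref 0
let total = ref 0

let () =
  run_workers ~ncores:4
    ~init:(fun () -> incr inits)
    ~finalize:(fun () -> incr finals)
    (fun _ x -> total := !total + x)
    [| 1; 2; 3; 4; 5; 6; 7; 8; 9; 10 |]
```

Both hooks default to doing nothing, so callers that do not need them are unaffected.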

If I send a pull request for this feature, is there a chance it will be accepted?

Hi Francois,
sure... sorry for not being very responsive lately (paper deadlines
:-))

Also, it would be nice to set up CI with Travis, so we can
more easily check that a new feature does not break existing functionality.

On Fri, Apr 25, 2014 at 10:19:05PM -0700, Francois Berenger wrote:

If I send a pull request for this feature, is there a chance it will be
accepted?



Roberto Di Cosmo

Professor, on secondment to INRIA
PPS, Universite Paris Diderot
Case 7014, 5, Rue Thomas Mann
F-75205 Paris Cedex 13, FRANCE

E-mail: roberto@dicosmo.org
WWW: http://www.dicosmo.org
Tel: ++33-(0)1-57 27 92 20
Identica: http://identi.ca/rdicosmo
Twitter: http://twitter.com/rdicosmo

Attachments: MIME accepted, Word deprecated
http://www.gnu.org/philosophy/no-word-attachments.html

Office location:
Office 3020 (3rd floor), Batiment Sophie Germain, Avenue de France
Metro Bibliotheque Francois Mitterrand, line 14/RER C

GPG fingerprint 2931 20CE 3A5A 5390 98EC 8BFC FCCA C3BE 39CB 12D3

That's a good idea!

You can implement it as a pair of optional parameters, for
example with the names you suggest, which then need to be
added to all the different combinators.

On Sun, Apr 20, 2014 at 07:14:13PM -0700, Francois Berenger wrote:

For complex things, it would be very handy to be able to register
an init function and a finalize function that would be run by each worker
process:

• the init function will be called only once by each child process, just
after the process is created
• the finalize function will be called only once by a child process just
before it exits

This allows, for example, to set up and clean up per-process output files
for workers of Array.iteri or List.iteri.
Maybe those functions should be called process_setup and process_cleanup,
or some better name.



@UnixJunkie: could you propose a specification for this feature? The ideal way would be a modified parmap.mli with the types you expect. Is it enough for the initialisation and finalisation functions to be of type unit -> unit, for example?

I guess unit -> unit should be OK.

I just committed a first version of Parmap that adds init and
finalize parameters to the parallel combinators.

Note that init now has type int -> unit: it receives as its
argument the index of the core on which the worker
is running. The init function defaults to the 'redirect' function
that is part of the original API.

The documentation has not been updated yet to reflect the change.

I would appreciate feedback and testing of this new feature.

On Thu, May 08, 2014 at 05:32:30PM -0700, Francois Berenger wrote:

I guess unit -> unit should be OK.



I think this int is not even needed if the parent process maintains a count of its children
and increments it only after each successful fork.

I feel the init function should default to doing nothing,
because this would mimic the current behavior and also mirror the default finalize function,
which does nothing by default.

Actually, in the current version of Parmap, the initialisation
phase calls redirect (in the code, init i just replaced the
redirect i call), and redirect is controlled by a boolean
value: if set, it redirects stdout/stderr; otherwise it does nothing.

On Sun, May 11, 2014 at 06:07:59PM -0700, Francois Berenger wrote:

I think this int is not even needed, if the parent process maintains a number
of sons and
if this number is incremented in the parent process only after each successful
fork.

I feel the init function should default to doing nothing
because this would mimic the current behavior and also reflect the default
finalize function,
also doing nothing by default.



Since init is now available, we can of course change this behaviour
and have no default initialisation at all. In that case, a user who
wants redirection needs to call the redirect
function explicitly as part of the initialisation. That does seem cleaner; I'll
give it a try.

We need init to have type int -> unit anyway, as there is otherwise no way of
knowing the index of the core on which the process is running.
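As an illustration of why the index argument is useful, the sketch below derives a distinct output file name from it, which is the per-process output file use case raised at the start of the thread. This is not Parmap's code: `run_workers` is a hypothetical stand-in that simulates the workers sequentially in one process, and `output_name`, `emit`, and the `worker_%d.out` naming scheme are assumptions made for the example.

```ocaml
(* Hypothetical sketch, not Parmap's code: workers are simulated one after
   another in a single process. With real forked workers, [current] would be
   private to each process. *)
let current : out_channel option ref = ref None

(* Assumed naming scheme: one file per worker index. *)
let output_name (i : int) : string = Printf.sprintf "worker_%d.out" i

(* [init] needs the index to pick a distinct file for each worker. *)
let init (dir : string) (i : int) : unit =
  current := Some (open_out (Filename.concat dir (output_name i)))

(* [finalize] closes whatever [init] opened. *)
let finalize () : unit =
  match !current with
  | Some oc -> close_out oc; current := None
  | None -> ()

(* Write one line to the current worker's file, if any. *)
let emit (line : string) : unit =
  match !current with
  | Some oc -> output_string oc (line ^ "\n")
  | None -> ()

(* Stand-in for a parallel iteri: worker [w] takes indices congruent to w. *)
let run_workers ~(init : int -> unit) ~(finalize : unit -> unit)
    ~(ncores : int) (f : int -> 'a -> unit) (a : 'a array) : unit =
  for w = 0 to ncores - 1 do
    init w;
    Array.iteri (fun i x -> if i mod ncores = w then f i x) a;
    finalize ()
  done

let dir = Filename.get_temp_dir_name ()

let () =
  run_workers ~ncores:2 ~init:(init dir) ~finalize
    (fun i x -> emit (Printf.sprintf "%d:%s" i x))
    [| "a"; "b"; "c" |]
```

Without the index, each worker would have to find another source of uniqueness for its file name, such as its process id.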

2014-05-12 8:46 GMT+02:00 Roberto Di Cosmo roberto@dicosmo.org:

Actually, in the current version of Parmap, the initialisation
phase calls redirect (in the code, init i just replaced the
redirect i call), and redirect is controlled by a boolean
value: if set it redirects stdout/stderr, otherwise does nothing.


I looked at the code quickly. Maybe Pervasives.at_exit could have been used
instead of explicitly calling finalize before each exit call.
That would call the finalize function even in case of an uncaught exception.
But I am not sure that's super useful.
I will test those new functions very soon and report back; thanks a lot for the implementation.
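A minimal sketch of the at_exit idea follows. It is a hypothetical worker skeleton, not Parmap's actual code: `run_worker` and the `cleanups` counter are names invented for the example, and `finalize` stands in for a user-supplied function.

```ocaml
(* Hypothetical worker skeleton, not Parmap's actual code. *)
let cleanups = ref 0

(* Stand-in for a user-supplied finalize function. *)
let finalize () =
  incr cleanups;
  prerr_endline "finalize: closing per-process resources"

let run_worker (work : unit -> unit) : unit =
  (* Registered once; the runtime calls it when the process exits,
     whether through [exit] or after an uncaught exception. *)
  at_exit finalize;
  work ()

let () =
  run_worker (fun () -> print_endline "working");
  (* Nothing has been finalized yet: the hook only fires at process exit. *)
  assert (!cleanups = 0)
```

Handlers registered this way do not run if the process is killed by a signal, so this approach covers normal exits and uncaught exceptions only.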

On Mon, May 12, 2014 at 12:42:09AM -0700, Francois Berenger wrote:

I looked at the code quickly. Maybe Pervasives.at_exit could have been used
instead of explicitly calling finalize before each exit call.
That would call the finalize function even in case of an uncaught exception.

Right, thanks for the suggestion!

But I am not sure that's super useful.
I will test very soon those new functions and report about my tests, thanks a
lot for the implementation.



All these changes are now committed in cb5a4b8.
Looking forward to feedback from field testing.

It is OK for me.
I tried it on my computer and checked that the init and finalize functions
were each called as many times as ncores.
I tried from 1 to 8 cores, which is the maximum for my machine.
Thanks a lot for implementing this!
Later, I will try to exploit those functions
to reach higher parallelization in a real-world application
of mine, but it will take a little more time to test that.
I'll report on my trials.

Great, and thanks for contributing to the documentation.

On Thu, May 15, 2014 at 07:16:42PM -0700, Francois Berenger wrote:

It is OK for me.
I tried on my computer and checked that the init and finalize functions
were called as many times as ncores.
I tried from 1 to 8 cores, which is the maximum for my machine.
Thanks a lot for implementing this !
I will try later to see if I can exploit those functions in order
to reach higher parallelization in some real world application
I have but it will take a little more time for me to test that.
I'll report about my trials.

