rdicosmo / parmap

Parmap is a minimalistic library allowing to exploit multicore architecture for OCaml programs with minimal modifications.

Home Page: http://rdicosmo.github.io/parmap/


Core use for concurrent programs

j-m-nunes opened this issue · comments

When I run two or more copies of the same program, the subprocesses (say 4) for the different program copies always share the first 4 cores although there are other cores free. Is this the expected behaviour, that all programs request the first (4) cores?
Thanks in advance,

Very good remark! Indeed, the Parmap code uses the kernel-level interface
to lock each process to a different core, but does so in a very simple way:
if there are n cores and k processes, it locks the k processes to the first
k cores.

When you run two programs, all their processes will try to use the same
cores.

The simplest solution would be to have the client program accept a
specification of a set of cores on its command line,
something like

 --coreset=0-3,5,8

and have parmap use only this set, with, of course, some sanity checks
(core 8 really exists, etc.), and defaulting to the set of all cores.

It should not be difficult to implement... if anybody wants to give it a try, I
would add something like

(** {6 Setting and getting the default value for the core set } *)

val set_coreset : int list -> unit

val get_coreset : unit -> int list

and then change all the code in Parmap that does

    (* spawn children *)
    for i = 0 to ncores - 1 do
      ...
    done

to

    List.iter (fun i ->
      ...) coreset

Roberto



Another solution might be to optionally disable process pinning to core.
It's funny, but I have never observed the behavior seen by this user.
@j-m-nunes are you on Linux?

@j-m-nunes just to check quickly whether it solves your problem: remove the call to Setcore.setcore in parmap.ml
(there is only one call site).
Then recompile and reinstall your modified version of parmap, and do the same for your test application.

Recent Linux kernels may shuffle processes less (natural CPU affinity), so disabling CPU affinity might not be that stupid for users who experience this problem (and who run a modern kernel).
Another possible simple fix would be to setcore the process to whatever core it is currently running on (and hope that the operating system is not stupid about where it starts a new process).
The proper fix is not simple: it means parmap users need to be aware of where current parmap processes are running on their machine (even the ones launched by other users on the same computer...).

I can reproduce the user's problem by running two copies of the same program concurrently.
I told each copy to use 8 cores; my machine has 16, but both htop and gkrellm show that only the first 8 cores are busy.

If you are interested, I can suggest other solutions to the problem: the default core-pinning policy should be randomized, and users should also be able to completely disable core pinning (in addition to being able to be control freaks, which you already implemented). The current default policy is really bad, especially on multi-user systems (like servers).

related to #66

@coti Camille, if you know of a way to do core pinning on a multi-user system by asking the OS (Linux at least, and maybe OS X too), we might be interested,
either from the command line or via some C code.
For example, is there an interface to do:

  • request_pinnable_cores: the OS would tell us which cores we could pin to
  • pin_cores: we try to reserve some cores for pinning
  • unpin_cores: we stop using the cores we have reserved (should be called automatically
    if the process exits).

So that several users and programs executing on the same system could do core pinning without stepping on each other's toes.
Currently, my solution is to disable core pinning, but the performance could be better.
Another way would be to have a system call to pin to core, which would fail if another process
is already pinned to that same core.
Thanks,
F.

Maybe this is an interesting piece of functionality missing from the kernel.

This has been fixed; the default behavior can now be changed with the following call:

Parmap.disable_core_pinning ()