proxb / PoshRSJob

Provides an alternative to PSjobs with greater performance and less overhead to run commands in the background, freeing up the console and allowing throttling on the jobs.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Caveats of Start-RSJob -ModulesToImport

codykonior opened this issue · comments

PoshRSJob 1.7.2.7 on PowerShell v5.

When you first use Start-RSJob, if you don't specify a -ModulesToImport, then on later runs of Start-RSJob the -ModulesToImport will not work and script blocks depending on those modules will fail.

This is because the runspaces have already been set up without the modules you wanted, and they haven't been torn down.

It's also worth mentioning there's no equivalent of a -Force for this. If you update the module, then it won't be reflected inside jobs in the run spaces (even when you start them later on).

This looks like a side effect of allowing the commands to work with ForEach-Object. I'll have to think about how I want to handle this in either rolling back the change and taking away the ForEach-Object support or figuring out another way to do this.

I'm not sure I want a -Force parameter with Start-RSJob as the users shouldn't have to worry about knowing when to force a new runspacepool to be created due to needing another module imported. At that point, they could just use a new batch name via the -Batch parameter and it would create a new runspacepool and import the module correctly.

This affects more than just -ModulesToImport. -FunctionsToLoad and I suspect (have not tested) that -PSSnapinsToImport will also be affected.

Quick example to show issue.

Function Test {'This is a test'}
Function Test1 {1..10}
1|start-rsjob {Test} -FunctionsToLoad Test
1|start-rsjob {Test1} -FunctionsToLoad Test1

Second job will throw error about function Test1 not being found.

At this point I consider this a breaking change brought on by #75 and plan on rolling it back to remove ForEach-Object support until I can have more time to look at it.

I do not have any issues with this
PSv4, PSv5 tested. with and without profile
note: PoshRSJob installed manually
as far as I see, both jobs have different runspaceid.
I'm also testing modules and also do not find any problems

PS D:\> Import-Module PoshRSJob
PS D:\> Function Test {'This is a test'}
PS D:\> Function Test1 {1..10}
PS D:\> 1|start-rsjob {Test} -FunctionsToLoad Test

Id       Name                 State           HasMoreData  HasErrors    Command
--       ----                 -----           -----------  ---------    -------
1        Job1                 Running         False        False        Test


PS D:\> 1|start-rsjob {Test1} -FunctionsToLoad Test1

Id       Name                 State           HasMoreData  HasErrors    Command
--       ----                 -----           -----------  ---------    -------
2        Job2                 Running         False        False        Test1


PS D:\> Get-RSJob | fl *


HasErrors      : False
Verbose        :
Debug          :
Warning        :
Progress       :
Name           : Job1
ID             : 1
State          : Completed
InputObject    : 1
InstanceID     : b320e1c2-b514-4c20-b0c8-e6c73a1ed1ae
Handle         : System.Management.Automation.PowerShellAsyncResult
Runspace       :
InnerJob       : System.Management.Automation.PowerShell
Finished       : System.Threading.ManualResetEvent
Command        : Test
Error          :
HasMoreData    : True
Output         : {This is a test}
RunspacePoolID : 5aa9db1a-7525-4465-8bb5-19c7d2d0f555
Completed      : True
Batch          : 5aa9db1a-7525-4465-8bb5-19c7d2d0f555

HasErrors      : False
Verbose        :
Debug          :
Warning        :
Progress       :
Name           : Job2
ID             : 2
State          : Completed
InputObject    : 1
InstanceID     : 65f685aa-0a54-4baf-9206-ee250f98693a
Handle         : System.Management.Automation.PowerShellAsyncResult
Runspace       :
InnerJob       : System.Management.Automation.PowerShell
Finished       : System.Threading.ManualResetEvent
Command        : Test1
Error          :
HasMoreData    : True
Output         : {1, 2, 3, 4...}
RunspacePoolID : 287945f8-5d95-4929-8bcb-b2a07dd790e2
Completed      : True
Batch          : 287945f8-5d95-4929-8bcb-b2a07dd790e2



PS D:\> Get-Module PoshRSJob

ModuleType Version    Name                                ExportedCommands
---------- -------    ----                                ----------------
Script     1.7.2.9    PoshRSJob                           {Get-RSJob, Receive-RSJob, Remove-RSJob, Start-RSJob...}


PS D:\>
PS D:\> 1..5|start-rsjob {Test} -FunctionsToLoad Test

Id       Name                 State           HasMoreData  HasErrors    Command
--       ----                 -----           -----------  ---------    -------
63       Job63                Running         False        False        Test
64       Job64                Running         False        False        Test
65       Job65                Running         False        False        Test
66       Job66                Running         False        False        Test
67       Job67                Running         False        False        Test


PS D:\> 1..5|start-rsjob {Test1} -FunctionsToLoad Test1

Id       Name                 State           HasMoreData  HasErrors    Command
--       ----                 -----           -----------  ---------    -------
68       Job68                Running         False        False        Test1
69       Job69                Running         False        False        Test1
70       Job70                Running         False        False        Test1
71       Job71                Running         False        False        Test1
72       Job72                Running         False        False        Test1


PS D:\> 1..5|start-rsjob {Test} -FunctionsToLoad Test

Id       Name                 State           HasMoreData  HasErrors    Command
--       ----                 -----           -----------  ---------    -------
73       Job73                Running         False        False        Test
74       Job74                Running         False        False        Test
75       Job75                Running         False        False        Test
76       Job76                Running         False        False        Test
77       Job77                Running         False        False        Test


PS D:\> 1..5|start-rsjob {Test1} -FunctionsToLoad Test1

Id       Name                 State           HasMoreData  HasErrors    Command
--       ----                 -----           -----------  ---------    -------
78       Job78                Running         False        False        Test1
79       Job79                Running         False        False        Test1
80       Job80                Running         False        False        Test1
81       Job81                Running         False        False        Test1
82       Job82                Running         False        False        Test1


PS D:\> Get-RSJob | select Command, Batch

Command Batch
------- -----
Test    ce18a3ca-8c46-4dae-a587-49b1008ff3fe
Test    ce18a3ca-8c46-4dae-a587-49b1008ff3fe
Test    ce18a3ca-8c46-4dae-a587-49b1008ff3fe
Test    ce18a3ca-8c46-4dae-a587-49b1008ff3fe
Test    ce18a3ca-8c46-4dae-a587-49b1008ff3fe
Test1   53d1f458-1260-48f2-8196-e870f351a5a2
Test1   53d1f458-1260-48f2-8196-e870f351a5a2
Test1   53d1f458-1260-48f2-8196-e870f351a5a2
Test1   53d1f458-1260-48f2-8196-e870f351a5a2
Test1   53d1f458-1260-48f2-8196-e870f351a5a2
Test    0fe9b702-d280-4ab8-9ca6-3dad247fdecf
Test    0fe9b702-d280-4ab8-9ca6-3dad247fdecf
Test    0fe9b702-d280-4ab8-9ca6-3dad247fdecf
Test    0fe9b702-d280-4ab8-9ca6-3dad247fdecf
Test    0fe9b702-d280-4ab8-9ca6-3dad247fdecf
Test1   843ad858-2f6e-410c-ac11-a24c2e08ea7d
Test1   843ad858-2f6e-410c-ac11-a24c2e08ea7d
Test1   843ad858-2f6e-410c-ac11-a24c2e08ea7d
Test1   843ad858-2f6e-410c-ac11-a24c2e08ea7d
Test1   843ad858-2f6e-410c-ac11-a24c2e08ea7d
PS D:\>

It's working now because you are using the latest version (1.7.2.9) that fixed it. If you download 1.7.2.7, then there would be issues with your first example. I pretty much know where the issue is happening at in the code. I define the module/pssnapin/function point before I get to the runspacepools and if an existing one is used, then none of these things are loaded. I just need to load everything after I pick a runspacepool (either a new one or existing) and then attempt to add these into the mix. It definitely needs testing because I don't know yet if it will actually work. :)

Yes, I need to read releasenotes....

but... You comment out both code branches - with and without -Batch
but with batch you know what you do, if runspacepool with selected batch name already exists, you reuse it and its fine when you ignore -modulestoload - you alreay load modules when you create batch (line 448)
thus the bug actually present when you do not use batch but still use the same runtimepool - lines 399-407

I think that Start-RSJob should distinguish between pipeline and several independent launches and reuse runspace pool only when it detect pipeline input

the best fix I know - generate batch name in begin {} block :)

begin {
If (-NOT $PSBoundParameters.ContainsKey('Batch')) {
                $Batch = $RunspacePoolID = [guid]::NewGuid().ToString()
}
}

and do not test for $PSBoundParameters.ContainsKey('Batch') inside process

I'm pretty sure I know how to solve the issue for Modules/Functions etc... in regards to making sure that they are updated properly; just need to get some time to update the code.

Distinguishing the pipeline support for the jobs can be checked fairly simply by first looking at the named parameter for InputObject and then looking at the Process block. The big thing is making sure that we handle not only ForEach-Object, but also the ForEach keyword and even using other loops like While to ensure that all bases are covered. I started looking at using Get-PSCallstack as a means to determine whether or not I could accurately see if there was any sort of looping going on, but wasn't too impressed with the initial results.

Get-PSCallstack has very poor performance. It's best not to call it in any looping code like a Process block.

Yep. That and you would still have to parse the code to determine if the command was used in a loop. Although this wouldn't be used in a Process block and would have instead been done in the End block once to determine if it is in a loop or not.

You want to support RunspacePool reusing automatically inside any loops ? I find it overkill feature because user can use -Batch.
and... imagine code like this

foreach ($i in 1..10)
{
  if ($i%2) {
    Start-RSJob { OddFunction() } -FunctionsToImport OddFunction
  } else {
   Start-RSJob { EvenFunction() }  -FunctionsToImport EvenFunction
  }
}

Are you sure user want to run both functions in the same Runspace ? I''m not

I'm think than the one only loop that can guarantee that user want it run in the same runspace - is pipeline. All other - user desision. To use or not to use -Batch parameter.

You want to support RunspacePool reusing automatically inside any loops ? I find it overkill feature because user can use -Batch.
I think the entire idea behind all of this was to support something like this:

1..10 | ForEach [
    $_|Start-RSJob -Name {$_} -ScriptBlock {$_}
}

So I guess I would answer that with a yes. But in hindsight, I really should have never bothered trying to wedge the type of support that I had with anything other than directly piping the input into the command.

If a person wanted to use a foreach loop with the command, then they need to know that each command will create a new runspacepool (and lose any use of throttling) unless they use the -Batch parameter (as you mention). I would consider that more of a user education to understand how to use the command and parameter to ensure that the module behaves as intended.

Some good feedback and this does fit in with what I spent some of the day working on to ensure that only -Batch use will reuse the runspacepool and have it autogenerated in the Begin block.

If you do desision about runspace reusing on code analysis then you probably can't catch code like this

$BigData | ForEach [
    if ( Test-SomeHardDecision $_ ) {
      $_ | Start-RSJob -ScriptBlock { TestFunction } -ModulesToImport 'C:\Debug\MyModule'
    }
    else {
       $_ | Start-RSJob -ScriptBlock { TestFunction } -ModulesToImport 'C:\Production\MyModule'
   }
}

and user will be disappointed on result because it will be breaking change with current (and past) module behaviour

Both your and my examples have point of management with -Batch parameter
But I believe that decision to reuse or not reuse pool can be only on user side or when invoke contract stable. And only pipeline can guarantee that it stable because all parameters bounded just one time on call and other loops bound parameters on each invoke and user may want to mutate it