Select-Object -First affects prior cmdlet in the pipeline
The PowerShell Strongly Encouraged Development Guidelines state that cmdlets should Implement for the Middle of a Pipeline, but I suspect that isn't doable for a parameter such as -Last of Select-Object, simply because you can't determine the last entry upfront. In other words: you have to wait for the input stream to finish before you can determine the last entry.
To prove this, I wrote a little script:
$Data = 1..5 | ForEach-Object {[pscustomobject]@{Index = "$_"}}
$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object -Last 5 | ForEach-Object { Write-Host 'After' $_.Index }
and compared this to Select-Object *:
$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object * | ForEach-Object { Write-Host 'After' $_.Index }
With results (left: Select-Object -Last 5, right: Select-Object *):
-Last 5   *
-------   -
Before 1  Before 1
Before 2  After 1
Before 3  Before 2
Before 4  After 2
Before 5  Before 3
After 1   After 3
After 2   Before 4
After 3   After 4
After 4   Before 5
After 5   After 5
Although this isn't documented, I think I can conclude from this that the -Last parameter indeed blocks the pipeline until all input has been received.
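To illustrate why a "last N" selection has to buffer, here is a minimal sketch of a function that mimics that behavior. Select-LastN, its -Count parameter, and the queue-based buffer are hypothetical names and choices made up for this example; this is not how Select-Object is actually implemented.

```powershell
# Hypothetical helper for illustration only; not the Select-Object implementation.
function Select-LastN {
    param(
        [Parameter(ValueFromPipeline)] $InputObject,
        [int] $Count = 1
    )
    begin {
        # Holds at most $Count of the most recently received objects.
        $buffer = [System.Collections.Generic.Queue[object]]::new()
    }
    process {
        $buffer.Enqueue($InputObject)
        # Drop the oldest object once the buffer grows past $Count.
        if ($buffer.Count -gt $Count) { [void]$buffer.Dequeue() }
    }
    end {
        # Nothing can be emitted earlier: only when the input has ended
        # is it known which objects are the last ones.
        $buffer.ToArray()
    }
}

1..5 | Select-LastN -Count 2   # emits 4 and 5, but only after all input was read
```

Because all output happens in the end block, any downstream cmdlet sees nothing until the upstream cmdlets have finished, which matches the Before/After ordering observed above.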
This is not a big deal, but I also tested it against the -First parameter and got some disturbing results. To better show this, I am not selecting all the objects but just the first two (-First 2):
$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object -First 2 | ForEach-Object { Write-Host 'After' $_.Index }
Before 1
After 1
Before 2
After 2
Note that with the -First 2 parameter, not only does the following cmdlet show two objects, but the preceding cmdlet (ForEach-Object { Write-Host 'Before' $_.Index; $_ }) also shows only 2 objects (instead of 5).
Apparently, the -First parameter reaches directly into the preceding cmdlet, which is different from, e.g., using the -Last 2 parameter:
$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object -Last 2 | ForEach-Object { Write-Host 'After' $_.Index }
Before 1
Before 2
Before 3
Before 4
Before 5
After 4
After 5
This also happens when using Out-Host instead of the Write-Host cmdlet, or when sending the results to a variable, like:
$Before = ""; $After = ""
$Data | ForEach-Object { $Before += $_.Index; $_ } | Select-Object -First 2 | ForEach-Object { $After += $_.Index }
$Before
$After
This shows on both Windows PowerShell (5.1.18362.628) and PowerShell Core (7.0.0).
Is this a bug?
Solution 1:
Select-Object affects the upstream commands by cheating
That might sound like a joke, but it's not.
To optimize pipeline streaming performance, Select-Object uses a trick not available to a regular user developing a cmdlet: it throws a StopUpstreamCommandsException.
Once caught, the runtime (indirectly) calls StopProcessing() on all the preceding commands, but does not treat it as a terminating error event, allowing the downstream cmdlets to continue executing.
This is extremely useful when you have a slow or computationally heavy command early in a pipeline:
# this will only take ~3 seconds to return with the StopUpstreamCommand behavior
# but would have incurred 2 extra seconds of "waiting to discard" otherwise
Measure-Command {
    1..5 | ForEach-Object { Start-Sleep -Seconds 1; $_ } | Select-Object -First 3
}
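If the upstream commands do need to see every object, Select-Object has a -Wait switch that, according to its documentation, turns this optimization off. The following sketch compares the two modes by counting how many objects the upstream ForEach-Object gets to process; the variable names $seen and $seenWait are made up for this example.

```powershell
# Default behavior: upstream is stopped as soon as 2 objects have been selected.
$seen = [System.Collections.Generic.List[int]]::new()
1..5 | ForEach-Object { $seen.Add($_); $_ } | Select-Object -First 2 | Out-Null
$seen.Count       # 2

# With -Wait: the optimization is off, so upstream processes all of its input.
$seenWait = [System.Collections.Generic.List[int]]::new()
1..5 | ForEach-Object { $seenWait.Add($_); $_ } | Select-Object -First 2 -Wait | Out-Null
$seenWait.Count   # 5
```

In both cases the downstream result is the same two objects; only the amount of work done upstream differs.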