Why does PowerShell give a different result in a one-liner than in a two-liner when converting JSON?

Overview

From a PowerShell 3 prompt, I want to call a RESTful service, get some JSON, and pretty-print it. I discovered that if I convert the data to a PowerShell object and then convert that object back to JSON, I get a nicely pretty-printed string. However, if I combine the two conversions into a one-liner with a pipe, I get a different result.

TL;DR: this:

PS> $psobj = $orig | ConvertFrom-JSON
PS> $psobj | ConvertTo-JSON

... gives me different result than this:

PS> $orig | ConvertFrom-JSON | ConvertTo-JSON

Original data

[
  {
    "Type": "1",
    "Name": "QA"
  },
  {
    "Type": "2",
    "Name": "DEV"
  }
]

Doing the conversion in two steps

I'm going to remove the whitespace (so it fits on one line...), convert it to a PowerShell object, and then convert it back to JSON. This works well and gives me back the correct data:

PS> $orig = '[{"Type": "1","Name": "QA"},{"Type": "2","Name": "DEV"}]'
PS> $psobj = $orig | ConvertFrom-JSON
PS> $psobj | ConvertTo-JSON
[
    {
        "Type":  "1",
        "Name":  "QA"
    },
    {
        "Type":  "2",
        "Name":  "DEV"
    }
]

Combining the two steps with a pipe

However, if I combine those last two statements into a one-liner, I get a different result:

PS> $orig | ConvertFrom-JSON | ConvertTo-JSON
{
    "value":  [
                  {
                      "Type":  "1",
                      "Name":  "QA"
                  },
                  {
                      "Type":  "2",
                      "Name":  "DEV"
                  }
              ],
    "Count":  2
}

Notice the addition of the keys "value" and "Count". Why is there a difference? I'm sure it has something to do with a desire to return a JSON object rather than a JSON array, but I don't understand why the way I do the conversion affects the end result.


Solution 1:

Note: The problem still exists as of Windows PowerShell v5.1, but PowerShell Core (v6+) is not affected.

The existing answers provide an effective workaround - enclosing $orig | ConvertFrom-JSON in (...) - but do not explain the problem correctly; also, the workaround cannot be used in all situations.


As for why use of an intermediate variable did not exhibit the problem:

The in-pipeline distinction between emitting an array's elements one by one vs. the array as a whole (as a single object) is nullified if you collect the output in a variable; e.g., $a = 1, 2 is effectively equivalent to $a = Write-Output -NoEnumerate 1, 2, even though the latter originally emits array 1, 2 as a single object; however, the distinction matters if further pipeline segments process the objects - see below.
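To make the distinction visible, here is a minimal sketch that counts how many objects each form sends down the pipeline. The variable names are made up for illustration. As far as I know, Windows PowerShell 5.1 reports 1 for the direct pipeline, whereas PowerShell 7.0 and later enumerate the array by default and report 2; the other two counts should be 2 everywhere.

```powershell
$json = '[ "one", "two" ]'

# Direct pipeline: how many objects does ConvertFrom-Json emit?
# (Version-dependent: the array may arrive as ONE object.)
$directCount = ($json | ConvertFrom-Json | Measure-Object).Count

# Via a variable: the array is stored as-is, and piping the variable
# enumerates its elements.
$parsed = $json | ConvertFrom-Json
$viaVariableCount = ($parsed | Measure-Object).Count

# Via (...): the output is collected first, then enumerated, just like the variable.
$viaParensCount = (($json | ConvertFrom-Json) | Measure-Object).Count

"direct: $directCount, variable: $viaVariableCount, parentheses: $viaParensCount"
```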


The problematic behavior is a combination of two factors:

  • ConvertFrom-Json deviates from normal output behavior: given a JSON string that represents an array, it sends the resulting array of objects through the pipeline as a single object rather than element by element.

    • You can verify ConvertFrom-Json's surprising behavior as follows:

        PS> '[ "one", "two" ]' | ConvertFrom-Json | Get-Member
      
        TypeName: System.Object[]  # !! should be: System.String
        ...
      
    • If ConvertFrom-Json passed its output through the pipeline one by one - as cmdlets normally do - Get-Member would instead return the (distinct) types of the items in the collection, which is [System.String] in this case.

      • Enclosing a command in (...) forces enumeration of its output, which is why ($orig | ConvertFrom-Json) | ConvertTo-Json is an effective workaround.
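    • Whether this behavior - which was still present in PowerShell Core v6 too - should be changed was debated in this GitHub issue; as of PowerShell 7.0, ConvertFrom-Json enumerates arrays by default and offers a -NoEnumerate switch to opt out.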

  • The System.Array type - the base type for all arrays - has a .Count property defined for it via PowerShell's ETS (Extended Type System - see Get-Help about_Types.ps1xml), which causes ConvertTo-Json to include that property in the JSON string it creates, with the array elements included in a sibling value property.

    • This happens only when ConvertTo-Json sees an array as a whole as an input object, as produced by ConvertFrom-Json in this case; e.g., , (1, 2) | ConvertTo-Json surfaces the problem (a nested array whose inner array is sent as a single object), but
      1, 2 | ConvertTo-Json does not (the array elements are sent individually).

    • This ETS-supplied .Count property was effectively obsoleted in PSv3, when arrays implicitly gained a .Count property due to PowerShell now surfacing explicitly implemented interface members as well, which surfaced the ICollection.Count property (additionally, all objects were given an implicit .Count property in an effort to unify the handling of scalars and collections).

    • Sensibly, this ETS property has therefore been removed in PowerShell Core, but is still present in Windows PowerShell v5.1 - see below for a workaround.
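The two factors above can be checked in a given session with a rough sketch like the following. It assumes that Get-TypeData returns nothing when no type data is registered for System.Array, and the variable names are purely illustrative. In an affected session I would expect the "as a whole" result to be wrapped in an object with "value" and "Count" properties.

```powershell
# Is an ETS-supplied Count member registered for System.Array in this session?
# (Assumption: no output from Get-TypeData means no ETS data is registered.)
$typeData = Get-TypeData -TypeName System.Array
$hasEtsCount = [bool] ($typeData -and $typeData.Members.ContainsKey('Count'))
"ETS Count on System.Array: $hasEtsCount"

# Array passed as a single input object (the unary comma wraps it in an outer array).
$asWhole = , (1, 2) | ConvertTo-Json -Compress

# Elements passed one by one: ConvertTo-Json collects them into a plain JSON array.
$enumerated = 1, 2 | ConvertTo-Json -Compress

"as a whole: $asWhole"
"enumerated: $enumerated"
```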


Workaround (not needed in PowerShell Core)

Tip of the hat, as many times before, to PetSerAl.

Note: This workaround is PSv3+ by definition, because the Convert*-Json cmdlets were only introduced in v3.

Given that the ETS-supplied .Count property is (a) the cause of the problem and (b) effectively obsolete in PSv3+, the solution is to simply remove it before calling ConvertTo-Json - it is sufficient to do this once in a session, and it should not affect other commands:

 Remove-TypeData System.Array # Remove the redundant ETS-supplied .Count property
 

With that, the extraneous .Count and .value properties should have disappeared:

 PS> '[ "one", "two" ]' | ConvertFrom-Json | ConvertTo-Json
 [
   "one",
   "two"
 ]

The above workaround also fixes the problem for array-valued properties; e.g.:

PS> '' | Select-Object @{ n='prop'; e={ @( 1, 2 ) } } | ConvertTo-Json
{
    "prop":  [
                 1,
                 2
             ]
}

Without the workaround, the value of "prop" would include the extraneous .Count and .value properties as well.
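If the same script has to run in both editions, a guarded variant of the workaround might look like the sketch below. The guard is my own addition, on the assumption that there is nothing to remove where the ETS data is absent. Note that Remove-TypeData drops all ETS data registered for System.Array, not just .Count.

```powershell
# Only remove the type data if something is actually registered for System.Array;
# in editions without the ETS member there is nothing to remove.
if (Get-TypeData -TypeName System.Array) {
    Remove-TypeData -TypeName System.Array
}

# An array sent as a single object should now serialize as a plain JSON array.
$roundTrip = , (1, 2) | ConvertTo-Json -Compress
$roundTrip
```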

Solution 2:

The solution is to wrap the first two operations in parentheses:

PS C:\> ($orig | ConvertFrom-JSON) | ConvertTo-JSON
[
    {
        "Type":  "1",
        "Name":  "QA"
    },
    {
        "Type":  "2",
        "Name":  "DEV"
    }
]

The parentheses allow you to grab the output of the first two operations all at once. Without them, PowerShell will attempt to parse any objects it gets separately. The collection of PSCustomObjects resulting from $orig | ConvertFrom-JSON contains two PSCustomObjects for the 1/QA and 2/DEV pairs, so by piping the output of that collection PowerShell attempts to handle the key/value pairs one at a time.

Using parentheses is a shorter way of "grouping" that output and allows you to operate on it without making a variable.
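If you pretty-print often, the parenthesized form can be wrapped in a small function. This is only a sketch: Format-JsonText is a hypothetical name (not a built-in cmdlet), and the default depth of 10 is an arbitrary assumption to avoid truncating nested data.

```powershell
# Hypothetical helper (not a built-in cmdlet): pretty-prints one JSON document.
function Format-JsonText {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string] $Json,

        # Assumed default; ConvertTo-Json's own default depth is shallower.
        [int] $Depth = 10
    )
    process {
        # (...) collects ConvertFrom-Json's output before it is piped on.
        ($Json | ConvertFrom-Json) | ConvertTo-Json -Depth $Depth
    }
}

$orig = '[{"Type": "1","Name": "QA"},{"Type": "2","Name": "DEV"}]'
$pretty = $orig | Format-JsonText
$pretty
```

Each pipeline input has to be one complete JSON document, so a multi-line file should be read as a single string first. Also note that a single-element JSON array would come back as a bare object, because only one object reaches ConvertTo-Json.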

Solution 3:

First off, why is this happening?

PowerShell automatically wraps multiple objects into a collection called a PSMemberSet that has a Count property on it. It's basically how PowerShell manages arbitrary arrays of objects. What's happening is that the Count property is getting added to the resulting JSON, yielding the undesirable results that you're seeing.

We can prove what I just stated above by doing the following:

$Json = @"
[
    {
        "Type":  "1",
        "Name":  "QA"
    },
    {
        "Type":  "2",
        "Name":  "DEV"
    }
]
"@;

# Deserialize the JSON into an array of "PSCustomObject" objects
$Deserialized = ConvertFrom-Json -InputObject $Json;
# Examine the PSBase property of the PowerShell array
# Note the .NET object type name: System.Management.Automation.PSMemberSet
$Deserialized.psbase | Get-Member;

Here is the output from the above:

   TypeName: System.Management.Automation.PSMemberSet

Name           MemberType            Definition
----           ----------            ----------
Add            Method                int IList.Add(System.Object value)
Address        Method                System.Object&, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089 Address(int )
Clear          Method                void IList.Clear()
......
......
Count          Property              int Count {get;}

You can work around this behavior by referencing the SyncRoot property of the PSMemberSet (which implements the ICollection .NET interface), and passing the value of that property to ConvertTo-Json.

Here is a complete, working example:

$Json = @"
[
    {
        "Type":  "1",
        "Name":  "QA"
    },
    {
        "Type":  "2",
        "Name":  "DEV"
    }
]
"@;

($Json | ConvertFrom-Json) | ConvertTo-Json;

The correct (expected) output will be displayed, similar to the following:

[
    {
        "Type":  "1",
        "Name":  "QA"
    },
    {
        "Type":  "2",
        "Name":  "DEV"
    }
]
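For completeness, the SyncRoot variant mentioned earlier could be sketched as follows. This assumes that .SyncRoot on a .NET array simply returns the array itself, so piping it enumerates the elements. Piping the variable directly would probably do the same thing.

```powershell
$Json = '[{"Type":"1","Name":"QA"},{"Type":"2","Name":"DEV"}]'

# Deserialize first and store the result in a variable.
$Deserialized = ConvertFrom-Json -InputObject $Json

# The property's value is an array, which the pipeline enumerates
# element by element before ConvertTo-Json sees it.
$viaSyncRoot = $Deserialized.SyncRoot | ConvertTo-Json
$viaSyncRoot
```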