What does Get-Content really output, a string or several properties?
What does `Get-Content` really output, a string or an object with several properties? If it's an object, which property contains the string, and can I modify that property? The file is only one line so I don't have to deal with an object array.
    PS C:\Users\me> Get-Content input.txt
    111
    PS C:\Users\me> Get-Content input.txt | Select-Object *

    PSPath       : C:\Users\me\input.txt
    PSParentPath : C:\Users\me
    PSChildName  : input
    PSDrive      : C
    PSProvider   : Microsoft.PowerShell.Core\FileSystem
    ReadCount    : 1
    Length       : 3

    PS C:\Users\me> (Get-Content input.txt).GetType()

    IsPublic IsSerial Name                                     BaseType
    -------- -------- ----                                     --------
    True     True     String                                   System.Object
`-replace` erases all of the extra properties except `Length`:
    PS C:\Users\me> (Get-Content input.txt) -replace 111, 222 | Select-Object *

    Length
    ------
         3
Solution 1:
- `Get-Content` outputs an array[1] of .NET `[string]` objects, each representing a line from the input file.
- However, `Get-Content` also decorates these string objects with additional properties that provide helpful metadata about where the strings came from.
  - All data-retrieving PowerShell provider cmdlets (`Get-ChildItem`, `Get-Item`, `Get-Content`, ...) do this; specifically, they decorate their output objects with the following `NoteProperty` members: `PSPath`, `PSParentPath`, `PSChildName`, `PSDrive`, `PSProvider`. `Get-Content` additionally adds `ReadCount`, as the output in your question shows.
  - This decorating is made possible by a PowerShell-specific (mostly) invisible helper type, `[psobject]`, in which .NET objects are wrapped, and which can store additional properties.
  - Typically, a decorated .NET object behaves as it normally would; e.g., outputting the strings returned by `Get-Content` prints them as-is, with no indication that they now contain additional properties (from PowerShell's perspective).
  - When it comes to serialization, however, or making a `[pscustomobject]` clone of an object with `Select-Object *`, or other contexts where reflection (enumeration of properties) is used, these properties do matter and do surface, as shown in your question.
    - When you serialize with `ConvertTo-Json`, for instance, PowerShell also serializes the additional properties and, somewhat artificially, represents the string's content as pseudo property `value` - see this answer.
  - Accessing `.psobject.BaseObject` on a decorated `[string]` instance - only[2] - allows you to bypass the PowerShell-added properties on demand, returning just the underlying .NET `[string]` instance (the base object).
- The `NoteProperty` members added by provider cmdlets such as `Get-Content` are instance members, i.e., they are specific to a given object. (PowerShell's ETS (Extended Type System) also allows you to decorate objects at the type level.)
  - Therefore, when you construct a new `[string]` instance, such as by using the `-replace` operator (or, generally, any string operation that returns a new string), the new instance does not have the additional properties.
  - From the perspective of a string operation, even a decorated string is just a string; the added properties are irrelevant in this context. This pattern applies generally: a decorated object can always act as if it were its .NET base object, and in most contexts does.
[1] Strictly speaking, `Get-Content`, as most cmdlets do, streams its output objects, i.e., it emits them to the pipeline one by one. Only when you capture this output (such as by assigning to a variable) is an array (`[object[]]`) constructed on demand, assuming the input file contains two or more lines; otherwise, the one and only line is captured as itself, not wrapped in an array - see this answer for more information.
[2] In PowerShell version 3 and higher, instance ETS members are no longer associated with a given .NET base object's `[psobject]` wrapper, but with the base object itself, using so-called resurrection tables. `[string]` instances are the only exception, for technical reasons, which is why only with `[string]` instances do you truly get an undecorated object with `.psobject.BaseObject`; for all other types, accessing the base object is effectively a no-op, as the object returned will still surface instance ETS members (too).
However, `.psobject.BaseObject` can still be useful for accessing members (properties, methods) of the base object, in case these members are shadowed (overridden) by ETS members of the same name.
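If you want to check the behavior described above yourself, here is a minimal sketch. It assumes PowerShell 3 or later and writes a throwaway one-line file; the file name `gc-demo-input.txt` and the variable names are made up for illustration.

```powershell
# Create a one-line sample file in the temp directory (hypothetical file name).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'gc-demo-input.txt'
Set-Content -LiteralPath $path -Value '111'

# A one-line file is captured as a single object, not an array.
$line = Get-Content -LiteralPath $path

# The object is an ordinary .NET string ...
$line.GetType().FullName    # System.String

# ... that carries instance-level NoteProperty members added by Get-Content.
$decorated = @($line.psobject.Properties |
    Where-Object MemberType -eq 'NoteProperty' |
    ForEach-Object Name)
$decorated                  # lists PSPath, PSParentPath, ... ReadCount

# A string operation such as -replace returns a new string without them.
$replaced = $line -replace '111', '222'
$replacedNotes = @($replaced.psobject.Properties |
    Where-Object MemberType -eq 'NoteProperty')
$replacedNotes.Count        # 0

# For strings, .psobject.BaseObject hands back the underlying string;
# piping it to Select-Object * lets you compare against the decorated one.
$line.psobject.BaseObject | Select-Object *

# Clean up the sample file.
Remove-Item -LiteralPath $path
```

The decorated string and the result of `-replace` are both `System.String`; the only difference is whether the instance still has the extra members attached.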
Solution 2:
The `Get-Content` cmdlet without the `-Raw` switch returns an array of strings, one per line of the text file, split on newline characters.
Using `Get-Content` with the `-Raw` switch returns the content of the file as a single string, including the newline characters.
Having said that, if (as in your case) the text file contains only one line, PowerShell does not wrap that line in a one-element array; you get back a single string.
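A quick way to compare the three cases is sketched below. It assumes PowerShell 3 or later (for `-Raw`); the temp file name `gc-demo-lines.txt` and the variable names are only placeholders.

```powershell
# Write a three-line sample file to the temp directory (hypothetical file name).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'gc-demo-lines.txt'
Set-Content -LiteralPath $path -Value 'one', 'two', 'three'

# Without -Raw: captured as an array with one element per line.
$lines = Get-Content -LiteralPath $path
$lines.GetType().Name     # Object[]
$lines.Count              # 3

# With -Raw: the whole file as one string, newlines included.
$raw = Get-Content -LiteralPath $path -Raw
$raw.GetType().Name       # String

# A one-line file is captured as a single string, not a one-element array.
Set-Content -LiteralPath $path -Value 'only line'
$single = Get-Content -LiteralPath $path
$single.GetType().Name    # String

# Clean up the sample file.
Remove-Item -LiteralPath $path
```

If you always want an array regardless of the line count, wrapping the call as `@(Get-Content ...)` is the usual way to get one.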
Strings in .NET are objects and have properties and methods.
Try `"abc" | Get-Member` to see what else is there to be found regarding the String object.
If you do `"abc" | Select-Object *`, the output will be

    Length
    ------
         3

Only the `Length` property of the string is displayed.
`Get-Content`, however, adds extra information to the resulting string object, as Ansgar Wiechers already commented, all of which relates to the file itself (`PSPath`, `PSParentPath`, `PSChildName`, `PSDrive`, `PSProvider`, `ReadCount`). These extra properties are all added as `NoteProperty` members, and you can see them if you run

    (Get-Content input.txt) | Get-Member

This returns an impressive list, but the object is still of type `System.String`.
When you perform a `-replace` operation on that string with `(Get-Content input.txt) -replace 111, 222`, the result is a new string object that no longer has the extra file-related properties. Now it is just a string and as such has nothing to do with where it originally came from.
Hope that explains