Replace CRLF using powershell
Editor's note: Judging by later comments by the OP, the gist of this question is: How can you convert a file with CRLF (Windows-style) line endings to a LF-only (Unix-style) file in PowerShell?
Here is my powershell script:
$original_file ='C:\Users\abc\Desktop\File\abc.txt'
(Get-Content $original_file) | Foreach-Object {
$_ -replace "'", "2"`
-replace '2', '3'`
-replace '1', '7'`
-replace '9', ''`
-replace "`r`n",'`n'
} | Set-Content "C:\Users\abc\Desktop\File\abc.txt" -Force
With this code I am able to replace 2 with 3, 1 with 7, and 9 with an empty string. However, I am unable to replace the carriage return + line feed with just a line feed; that part doesn't work.
This is a state-of-the-union answer as of Windows PowerShell v5.1 / PowerShell Core v6.2.0:
Andrew Savinykh's ill-fated answer, despite being the accepted one, is, as of this writing, fundamentally flawed (I do hope it gets fixed - there's enough information in the comments - and in the edit history - to do so).
Ansgar Wiechers' helpful answer works well, but requires direct use of the .NET Framework (and reads the entire file into memory, though that could be changed). Direct use of the .NET Framework is not a problem per se, but is harder to master for novices and hard to remember in general.
A future version of PowerShell Core will have a Convert-TextFile cmdlet with a -LineEnding parameter to allow in-place updating of text files with a specific newline style, as being discussed on GitHub.
In PSv5+, PowerShell-native solutions are now possible, because Set-Content now supports the -NoNewline switch, which prevents undesired appending of a platform-native newline[1]:
# Convert CRLFs to LFs only.
# Note:
# * (...) around Get-Content ensures that $file is read *in full*
# up front, so that it is possible to write back the transformed content
# to the same file.
# * + "`n" ensures that the file has a *trailing LF*, which Unix platforms
# expect.
((Get-Content $file) -join "`n") + "`n" | Set-Content -NoNewline $file
The above relies on Get-Content's ability to read a text file that uses any combination of CR-only, CRLF, and LF-only newlines line by line.
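To see the line-splitting behavior this relies on, the following minimal sketch writes a small file with mixed newline styles and reads it back. The temp-file name is hypothetical and chosen for illustration only.

```powershell
# Hypothetical demo file containing CRLF, LF-only, and CR-only newlines.
$demo = Join-Path ([IO.Path]::GetTempPath()) 'mixed-newlines-demo.txt'
[IO.File]::WriteAllText($demo, "a`r`nb`nc`rd")

# Get-Content returns one string per line, with the newline characters removed.
$lines = @(Get-Content $demo)
$lines.Count
$lines -join '|'
```

Because the newlines are already stripped from each line, the lines can be rejoined with whatever separator is wanted, which is what -join "`n" does in the command above.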
Caveats:
- You need to specify the output encoding to match the input file's in order to recreate it with the same encoding. The command above does NOT specify an output encoding; to do so, use -Encoding; without -Encoding:
  - In Windows PowerShell, you'll get "ANSI" encoding, your system's single-byte, 8-bit legacy encoding, such as Windows-1252 on US-English systems.
  - In PowerShell Core, you'll get UTF-8 encoding without a BOM.
- The input file's content as well as its transformed copy must fit into memory as a whole, which can be problematic with large input files.
- There's a risk of file corruption, if the process of writing back to the input file gets interrupted.
[1] In fact, if there are multiple strings to write, -NoNewline also doesn't place a newline between them; in the case at hand, however, this is irrelevant, because only one string is written.
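As a sketch of the encoding caveat, the command can be given an explicit -Encoding on both the read and the write side. This assumes the input file is UTF-8, which may not hold for your file; the temp-file name below is hypothetical. Note that in Windows PowerShell -Encoding utf8 writes a BOM, whereas in PowerShell Core it does not.

```powershell
# Hypothetical demo file, assumed (for illustration) to be UTF-8 encoded.
$file = Join-Path ([IO.Path]::GetTempPath()) 'crlf-encoding-demo.txt'
[IO.File]::WriteAllText($file, "caf$([char]0xE9)`r`nline two`r`n")

# Read and write with the same, explicitly named encoding.
((Get-Content -Encoding utf8 $file) -join "`n") + "`n" |
    Set-Content -NoNewline -Encoding utf8 $file

# Check the result: no CR should remain, and the file should end in LF.
$result = [IO.File]::ReadAllText($file)
$result.Contains("`r")
$result.EndsWith("`n")
```

If the input uses a different encoding, substitute the matching -Encoding value on both cmdlets.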
You have not specified the version, so I'm assuming you are using PowerShell v3.
Try this:
$path = "C:\Users\abc\Desktop\File\abc.txt"
(Get-Content $path -Raw).Replace("`r`n","`n") | Set-Content $path -Force
Editor's note: As mike z points out in the comments, Set-Content appends a trailing CRLF, which is undesired. Verify with: 'hi' > t.txt; (Get-Content -Raw t.txt).Replace("`r`n","`n") | Set-Content t.txt; (Get-Content -Raw t.txt).EndsWith("`r`n"), which yields $True.
Note that this loads the whole file into memory, so you might want a different solution if you need to process huge files.
UPDATE
This might work for v2 (sorry nowhere to test):
$in = "C:\Users\abc\Desktop\File\abc.txt"
$out = "C:\Users\abc\Desktop\File\abc-out.txt"
(Get-Content $in) -join "`n" > $out
Editor's note: Note that this solution (now) writes to a different file and is therefore not equivalent to the (still flawed) v3 solution. (A different file is targeted to avoid the pitfall Ansgar Wiechers points out in the comments: using > truncates the target file before execution begins). More importantly, though: this solution too appends a trailing CRLF, which may be undesired. Verify with 'hi' > t.txt; (Get-Content t.txt) -join "`n" > t.NEW.txt; [io.file]::ReadAllText((Convert-Path t.NEW.txt)).endswith("`r`n"), which yields $True.
The same reservation about loading the whole file into memory applies here, though.
Alternative solution that won't append a spurious CR-LF:
$original_file ='C:\Users\abc\Desktop\File\abc.txt'
$text = [IO.File]::ReadAllText($original_file) -replace "`r`n", "`n"
[IO.File]::WriteAllText($original_file, $text)
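Two details of the .NET approach are worth making explicit. .NET methods resolve relative paths against the process's working directory, which can differ from PowerShell's current location, and WriteAllText without an encoding argument writes UTF-8 without a BOM. A hedged sketch that passes a full path and an explicit encoding follows; the demo file name is hypothetical, and the UTF-8 assumption may not match your file.

```powershell
# Hypothetical demo file standing in for the real input.
$original_file = Join-Path ([IO.Path]::GetTempPath()) 'crlf-dotnet-demo.txt'
[IO.File]::WriteAllText($original_file, "one`r`ntwo`r`n")

# Convert-Path resolves a PowerShell path to a full file-system path,
# so .NET sees the same file that PowerShell does.
$fullPath = Convert-Path -LiteralPath $original_file

# Assumption: the file is UTF-8; $false means "do not emit a BOM".
$utf8NoBom = New-Object System.Text.UTF8Encoding -ArgumentList $false

$text = [IO.File]::ReadAllText($fullPath, $utf8NoBom) -replace "`r`n", "`n"
[IO.File]::WriteAllText($fullPath, $text, $utf8NoBom)
```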
Adding another version, based on the examples above by @ricky89 and @mklement0, with a few improvements:
Script to process:
- *.txt files in the current folder
- replace LF with CRLF (Unix to Windows line-endings)
- save resulting files to CR-to-CRLF subfolder
- tested on 100MB+ files, PS v5
LF-to-CRLF.ps1:
# get the directory containing this script
$currentDirectory = Split-Path $MyInvocation.MyCommand.Path -Parent
# create subdir CR-to-CRLF for new files
$outDir = Join-Path $currentDirectory "CR-to-CRLF"
New-Item -ItemType Directory -Force -Path $outDir | Out-Null
# get all .txt files
Get-ChildItem $currentDirectory -Force | Where-Object {$_.Extension -eq ".txt"} | ForEach-Object {
    $file = New-Object System.IO.StreamReader -ArgumentList $_.FullName
    # Resulting file will be in CR-to-CRLF subdir
    $outstream = [System.IO.StreamWriter] (Join-Path $outDir $_.Name)
    # write CRLF explicitly rather than relying on the platform default newline
    $outstream.NewLine = "`r`n"
    $count = 0
    # read line by line; WriteLine terminates each line with CRLF.
    # Compare against $null so that empty lines do not end the loop early.
    while ($null -ne ($line = $file.ReadLine())) {
        $count += 1
        $outstream.WriteLine($line)
    }
    $file.Close()
    $outstream.Close()
    Write-Host ("$_`: " + $count + ' lines processed.')
}
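The same streaming pattern can be pointed in the direction the original question asks for (CRLF to LF) by setting the writer's NewLine property to a bare LF. This is a minimal sketch with hypothetical temp-file names; it writes to a separate output file, assumes UTF-8 text, and always ends the output with an LF even if the input had no trailing newline.

```powershell
# Hypothetical demo input with CRLF newlines, including an empty line.
$in  = Join-Path ([IO.Path]::GetTempPath()) 'stream-crlf-in.txt'
$out = Join-Path ([IO.Path]::GetTempPath()) 'stream-lf-out.txt'
[IO.File]::WriteAllText($in, "a`r`n`r`nb`r`n")

$reader = New-Object System.IO.StreamReader -ArgumentList $in
$writer = New-Object System.IO.StreamWriter -ArgumentList $out
# WriteLine appends this string after every line.
$writer.NewLine = "`n"
try {
    # ReadLine returns $null at end of file; an empty line is '' and must not stop the loop.
    while ($null -ne ($line = $reader.ReadLine())) {
        $writer.WriteLine($line)
    }
}
finally {
    # Always release the file handles, even if an error occurs mid-copy.
    $reader.Dispose()
    $writer.Dispose()
}
```

Only one line is held in memory at a time, so this avoids the whole-file memory cost of the Get-Content based approaches.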
Below is my script for converting all files recursively. You can specify folders or files to exclude.
# Regex alternations of folder and file names to skip.
$excludeFolders = "node_modules|dist|\.vs"
$excludeFiles = ".*\.map.*|.*\.zip|.*\.png|.*\.ps1"
Function Dos2Unix {
    [CmdletBinding()]
    Param([Parameter(ValueFromPipeline)] $fileName)
    Process {
        # Print a progress dot for every file examined.
        Write-Host -NoNewline "."
        $fileContents = Get-Content -Raw -LiteralPath $fileName
        # Rewrite the file only if it actually contains a CRLF.
        If ($fileContents -match "\r\n") {
            Write-Host "`r`nCleaning file: $fileName"
            Set-Content -NoNewline -Encoding utf8 -LiteralPath $fileName -Value ($fileContents -replace "`r`n", "`n")
        }
    }
}
Get-ChildItem -File "." -Recurse |
    Where-Object {$_.PSParentPath -notmatch $excludeFolders} |
    Where-Object {$_.PSPath -notmatch $excludeFiles} |
    ForEach-Object { $_.PSPath | Dos2Unix }