Backslash and quote in command-line arguments
Is the following behaviour a feature or a bug in C#/.NET?
Test application:
using System;
using System.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Arguments:");
foreach (string arg in args)
{
Console.WriteLine(arg);
}
Console.WriteLine();
Console.WriteLine("Command Line:");
var clArgs = Environment.CommandLine.Split(' ');
foreach (string arg in clArgs.Skip(clArgs.Length - args.Length))
{
Console.WriteLine(arg);
}
Console.ReadKey();
}
}
}
Run it with command line arguments:
a "b" "\\x\\" "\x\"
This is the output I get:
Arguments:
a
b
\\x\
\x"
Command Line:
a
"b"
"\\x\\"
"\x\"
Backslashes are missing and a quote has not been removed in the args passed to Main(). What is the correct workaround, other than manually parsing Environment.CommandLine?
Solution 1:
According to this article by Jon Galloway, backslashes in command-line arguments can behave in unexpected ways.
Most notably it mentions that "Most applications (including .NET applications) use CommandLineToArgvW to decode their command lines. It uses crazy escaping rules which explain the behaviour you're seeing."
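In short, backslashes are taken literally unless they immediately precede a double quote. A run of 2n backslashes followed by a quote produces n backslashes, and the quote acts as a delimiter; a run of 2n+1 backslashes followed by a quote produces n backslashes and a literal quote. So only the backslashes directly in front of a quote need to be doubled, and a quote that should appear in the argument must be written as \".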
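To make the rule concrete, here is a minimal C# sketch of the backslash and quote handling (backslashes are literal unless a run of them ends at a double quote). It assumes the program name has already been stripped from the command line. ArgvSketch and Split are hypothetical names, not part of .NET, and the sketch deliberately ignores the special treatment of doubled quotes ("") inside a quoted section, so treat it as an illustration rather than a faithful reimplementation of CommandLineToArgvW.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not a .NET API.
public static class ArgvSketch
{
    // Splits the argument part of a command line (program name removed).
    public static List<string> Split(string commandLine)
    {
        var args = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool hasArg = false; // true once the current argument has started (covers "")
        int i = 0;
        while (i < commandLine.Length)
        {
            char c = commandLine[i];
            if (c == '\\')
            {
                // Count the whole run of backslashes first.
                int start = i;
                while (i < commandLine.Length && commandLine[i] == '\\') i++;
                int count = i - start;
                if (i < commandLine.Length && commandLine[i] == '"')
                {
                    // Run ends at a quote: every pair collapses to one backslash.
                    current.Append('\\', count / 2);
                    if (count % 2 == 1)
                        current.Append('"');   // odd run: the quote is literal
                    else
                        inQuotes = !inQuotes;  // even run: the quote is a delimiter
                    i++; // consume the quote
                }
                else
                {
                    // Run not followed by a quote: backslashes are literal.
                    current.Append('\\', count);
                }
                hasArg = true;
            }
            else if (c == '"')
            {
                inQuotes = !inQuotes;
                hasArg = true;
                i++;
            }
            else if ((c == ' ' || c == '\t') && !inQuotes)
            {
                // Unquoted whitespace ends the current argument.
                if (hasArg)
                {
                    args.Add(current.ToString());
                    current.Clear();
                    hasArg = false;
                }
                i++;
            }
            else
            {
                current.Append(c);
                hasArg = true;
                i++;
            }
        }
        if (hasArg) args.Add(current.ToString());
        return args;
    }
}
```

Calling ArgvSketch.Split(@"a ""b"" ""\\x\\"" ""\x\""") yields a, b, \\x\ and \x", which matches the output shown in the question: the final \" in each of the last two arguments is consumed by the escaping rule instead of closing the quoted section cleanly.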
Based on these rules, I believe you would have to pass the arguments as follows to get the result you want:
a "b" "\\x\\\\" "\x\\"
"Whacky" indeed.
The full story of the crazy escaping rules was told in 2011 by an MS blog entry: "Everyone quotes command line arguments the wrong way"
Raymond Chen also had something to say on the matter (already back in 2010): "What's up with the strange treatment of quotation marks and backslashes by CommandLineToArgvW"
The situation has not changed since: the escaping rules described in "Everyone quotes command line arguments the wrong way" are still correct as of 2020 and Windows 10.
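If you are the one launching the process, the same rules can be applied in reverse. The following is a minimal sketch of quoting a single argument so that CommandLineToArgvW-style parsing returns it unchanged, modeled on my reading of the algorithm in that blog entry. ArgvQuoteSketch and Quote are hypothetical names, and the sketch does not attempt to escape cmd.exe metacharacters, which is a separate problem.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not a .NET API.
public static class ArgvQuoteSketch
{
    public static string Quote(string arg)
    {
        // Nothing to do if the argument is non-empty and has no whitespace or quotes.
        if (arg.Length > 0 && arg.IndexOfAny(new[] { ' ', '\t', '\n', '\v', '"' }) < 0)
            return arg;

        var sb = new StringBuilder();
        sb.Append('"');
        int i = 0;
        while (i < arg.Length)
        {
            // Count the run of backslashes at the current position.
            int backslashes = 0;
            while (i < arg.Length && arg[i] == '\\')
            {
                backslashes++;
                i++;
            }

            if (i == arg.Length)
            {
                // Trailing backslashes: double them so the closing quote stays a delimiter.
                sb.Append('\\', backslashes * 2);
            }
            else if (arg[i] == '"')
            {
                // Backslashes before a quote: double them, then escape the quote itself.
                sb.Append('\\', backslashes * 2 + 1);
                sb.Append('"');
                i++;
            }
            else
            {
                // Backslashes before any other character are literal.
                sb.Append('\\', backslashes);
                sb.Append(arg[i]);
                i++;
            }
        }
        sb.Append('"');
        return sb.ToString();
    }
}
```

For example, Quote(@"a b\") returns "a b\\" (including the surrounding quotes), while Quote(@"\\x\\") returns the argument unchanged, because without whitespace or quotes the backslashes never end up in front of a quote.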
Solution 2:
I came across this same issue the other day and had a tough time getting through it. In my googling, I came across this article regarding VB.NET (the language of my application) that solved the problem without having to change any of my other code based on the arguments.
In that article, he refers to the original article, which was written for C#. Here is the actual code; pass it Environment.CommandLine:
C#:
using System;
using System.Text;
class CommandLineTools
{
/// <summary>
/// C-like argument parser
/// </summary>
/// <param name="commandLine">Command line string with arguments. Use Environment.CommandLine</param>
/// <returns>The args[] array (argv)</returns>
public static string[] CreateArgs(string commandLine)
{
StringBuilder argsBuilder = new StringBuilder(commandLine);
bool inQuote = false;
// Convert the spaces to a newline sign so we can split at newline later on
// Only convert spaces which are outside the boundaries of quoted text
for (int i = 0; i < argsBuilder.Length; i++)
{
if (argsBuilder[i].Equals('"'))
{
inQuote = !inQuote;
}
if (argsBuilder[i].Equals(' ') && !inQuote)
{
argsBuilder[i] = '\n';
}
}
// Split to args array
string[] args = argsBuilder.ToString().Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
// Clean the '"' signs from the args as needed.
for (int i = 0; i < args.Length; i++)
{
args[i] = ClearQuotes(args[i]);
}
return args;
}
/// <summary>
/// Cleans quotes from the arguments.<br/>
/// All single quotes (") will be removed.<br/>
/// Every pair of quotes ("") will transform to a single quote.<br/>
/// </summary>
/// <param name="stringWithQuotes">A string with quotes.</param>
/// <returns>The same string if it has no quotes, or a cleaned string if it does.</returns>
private static string ClearQuotes(string stringWithQuotes)
{
int quoteIndex;
if ((quoteIndex = stringWithQuotes.IndexOf('"')) == -1)
{
// String is without quotes..
return stringWithQuotes;
}
// Linear sb scan is faster than string assignment if quote count is 2 or more (=always)
StringBuilder sb = new StringBuilder(stringWithQuotes);
for (int i = quoteIndex; i < sb.Length; i++)
{
if (sb[i].Equals('"'))
{
// If we are not at the last index and the next one is '"', we need to jump one to preserve one
if (i != sb.Length - 1 && sb[i + 1].Equals('"'))
{
i++;
}
// We remove and then set index one backwards.
// This is because the remove itself is going to shift everything left by 1.
sb.Remove(i--, 1);
}
}
return sb.ToString();
}
}
VB.NET:
Imports System.Text
' Original version by Jonathan Levison (C#)'
' http://sleepingbits.com/2010/01/command-line-arguments-with-double-quotes-in-net/
' converted using http://www.developerfusion.com/tools/convert/csharp-to-vb/
' and then some manual effort to fix language discrepancies
Friend Class CommandLineHelper
''' <summary>
''' C-like argument parser
''' </summary>
''' <param name="commandLine">Command line string with arguments. Use Environment.CommandLine</param>
''' <returns>The args[] array (argv)</returns>
Public Shared Function CreateArgs(commandLine As String) As String()
Dim argsBuilder As New StringBuilder(commandLine)
Dim inQuote As Boolean = False
' Convert the spaces to a newline sign so we can split at newline later on
' Only convert spaces which are outside the boundaries of quoted text
For i As Integer = 0 To argsBuilder.Length - 1
If argsBuilder(i).Equals(""""c) Then
inQuote = Not inQuote
End If
If argsBuilder(i).Equals(" "c) AndAlso Not inQuote Then
argsBuilder(i) = ControlChars.Lf
End If
Next
' Split to args array
Dim args As String() = argsBuilder.ToString().Split(New Char() {ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries)
' Clean the '"' signs from the args as needed.
For i As Integer = 0 To args.Length - 1
args(i) = ClearQuotes(args(i))
Next
Return args
End Function
''' <summary>
''' Cleans quotes from the arguments.<br/>
''' All single quotes (") will be removed.<br/>
''' Every pair of quotes ("") will transform to a single quote.<br/>
''' </summary>
''' <param name="stringWithQuotes">A string with quotes.</param>
''' <returns>The same string if it has no quotes, or a cleaned string if it does.</returns>
Private Shared Function ClearQuotes(stringWithQuotes As String) As String
Dim quoteIndex As Integer = stringWithQuotes.IndexOf(""""c)
If quoteIndex = -1 Then Return stringWithQuotes
' Linear sb scan is faster than string assignment if quote count is 2 or more (=always)
Dim sb As New StringBuilder(stringWithQuotes)
Dim i As Integer = quoteIndex
Do While i < sb.Length
If sb(i).Equals(""""c) Then
' If we are not at the last index and the next one is '"', we need to jump one to preserve one
If i <> sb.Length - 1 AndAlso sb(i + 1).Equals(""""c) Then
i += 1
End If
' Remove the quote, then step the index back by one,
' because the removal shifts everything after it left by 1.
sb.Remove(i, 1)
i -= 1
End If
i += 1
Loop
Return sb.ToString()
End Function
End Class
Solution 3:
I worked around the problem a different way.
Instead of using the already-parsed arguments, I take the raw command-line string as it is and run it through my own parser:
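// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
static void Main(string[] args)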
{
var param = ParseString(Environment.CommandLine);
...
}
// The following pattern implements this notation:
// -key1 = some value -key2 = "some value even with '-' character " ...
private const string ParameterQuery = "\\-(?<key>\\w+)\\s*=\\s*(\"(?<value>[^\"]*)\"|(?<value>[^\\-]*))\\s*";
private static Dictionary<string, string> ParseString(string value)
{
var regex = new Regex(ParameterQuery);
return regex.Matches(value).Cast<Match>().ToDictionary(m => m.Groups["key"].Value, m => m.Groups["value"].Value);
}
This concept lets you type quotes without the escape prefix.
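As a quick check of how this pattern behaves, here is a self-contained sketch that reuses the same regular expression on a sample string. KeyValueArgsDemo and the sample command line are made up for illustration; only the pattern itself comes from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper class for illustration only.
public static class KeyValueArgsDemo
{
    // Same pattern as above: -key = "quoted value" or -key = unquoted value
    private const string ParameterQuery =
        "\\-(?<key>\\w+)\\s*=\\s*(\"(?<value>[^\"]*)\"|(?<value>[^\\-]*))\\s*";

    public static Dictionary<string, string> ParseString(string value)
    {
        var regex = new Regex(ParameterQuery);
        // One dictionary entry per matched -key = value pair.
        return regex.Matches(value)
                    .Cast<Match>()
                    .ToDictionary(m => m.Groups["key"].Value,
                                  m => m.Groups["value"].Value);
    }
}
```

With the input @"app.exe -path = ""C:\temp\"" -mode = fast", ParseString returns path mapped to C:\temp\ and mode mapped to fast, so a backslash in front of the closing quote survives. Two limitations follow directly from the pattern: an unquoted value stops at the next '-' and keeps any trailing whitespace, and ToDictionary throws if the same key appears twice.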