Escape command line arguments in c#
Solution 1:
It's more complicated than that though!
I was having related problem (writing front-end .exe that will call the back-end with all parameters passed + some extra ones) and so i looked how people do that, ran into your question. Initially all seemed good doing it as you suggest arg.Replace (@"\", @"\\").Replace(quote, @"\"+quote)
.
However when i call with arguments c:\temp a\\b
, this gets passed as c:\temp
and a\\b
, which leads to the back-end being called with "c:\\temp" "a\\\\b"
- which is incorrect, because there that will be two arguments c:\\temp
and a\\\\b
- not what we wanted! We have been overzealous in escapes (windows is not unix!).
And so i read in detail http://msdn.microsoft.com/en-us/library/system.environment.getcommandlineargs.aspx and it actually describes there how those cases are handled: backslashes are treated as escape only in front of double quote.
There is a twist to it in how multiple \
are handled there, the explanation can leave one dizzy for a while. I'll try to re-phrase said unescape rule here: say we have a substring of N \
, followed by "
. When unescaping, we replace that substring with int(N/2) \
and iff N was odd, we add "
at the end.
The encoding for such decoding would go like that: for an argument, find each substring of 0-or-more \
followed by "
and replace it by twice-as-many \
, followed by \"
. Which we can do like so:
s = Regex.Replace(arg, @"(\\*)" + "\"", @"$1$1\" + "\"");
That's all...
PS. ... not. Wait, wait - there is more! :)
We did the encoding correctly but there is a twist because you are enclosing all parameters in double-quotes (in case there are spaces in some of them). There is a boundary issue - in case a parameter ends on \
, adding "
after it will break the meaning of closing quote. Example c:\one\ two
parsed to c:\one\
and two
then will be re-assembled to "c:\one\" "two"
that will me (mis)understood as one argument c:\one" two
(I tried that, i am not making it up). So what we need in addition is to check if argument ends on \
and if so, double the number of backslashes at the end, like so:
s = "\"" + Regex.Replace(s, @"(\\+)$", @"$1$1") + "\"";
Solution 2:
My answer was similar to Nas Banov's answer but I wanted double quotes only if necessary.
Cutting out extra unnecessary double quotes
My code saves unnecessarily putting double quotes around it all the time which is important *when you are getting up close to the character limit for parameters.
/// <summary>
/// Encodes an argument for passing into a program
/// </summary>
/// <param name="original">The value that should be received by the program</param>
/// <returns>The value which needs to be passed to the program for the original value
/// to come through</returns>
public static string EncodeParameterArgument(string original)
{
if( string.IsNullOrEmpty(original))
return original;
string value = Regex.Replace(original, @"(\\*)" + "\"", @"$1\$0");
value = Regex.Replace(value, @"^(.*\s.*?)(\\*)$", "\"$1$2$2\"");
return value;
}
// This is an EDIT
// Note that this version does the same but handles new lines in the arugments
public static string EncodeParameterArgumentMultiLine(string original)
{
if (string.IsNullOrEmpty(original))
return original;
string value = Regex.Replace(original, @"(\\*)" + "\"", @"$1\$0");
value = Regex.Replace(value, @"^(.*\s.*?)(\\*)$", "\"$1$2$2\"", RegexOptions.Singleline);
return value;
}
explanation
To escape the backslashes and double quotes correctly you can just replace any instances of multiple backslashes followed by a single double quote with:
string value = Regex.Replace(original, @"(\\*)" + "\"", @"\$1$0");
An extra twice the original backslashes + 1 and the original double quote. i.e., '\' + originalbackslashes + originalbackslashes + '"'. I used $1$0 since $0 has the original backslashes and the original double quote so it makes the replacement a nicer one to read.
value = Regex.Replace(value, @"^(.*\s.*?)(\\*)$", "\"$1$2$2\"");
This can only ever match an entire line that contains a whitespace.
If it matches then it adds double quotes to the beginning and end.
If there was originally backslashes on the end of the argument they will not have been quoted, now that there is a double quote on the end they need to be. So they are duplicated, which quotes them all, and prevents unintentionally quoting the final double quote
It does a minimal matching for the first section so that the last .*? doesn't eat into matching the final backslashes
Output
So these inputs produce the following outputs
hello
hello
\hello\12\3\
\hello\12\3\
hello world
"hello world"
\"hello\"
\\"hello\\\"
\"hello\ world
"\\"hello\ world"
\"hello\\\ world\
"\\"hello\\\ world\\"
hello world\\
"hello world\\\\"
Solution 3:
I have ported a C++ function from the Everyone quotes command line arguments the wrong way article.
It works fine, but you should note that cmd.exe
interprets command line differently. If (and only if, like the original author of article noted) your command line will be interpreted by cmd.exe
you should also escape shell metacharacters.
/// <summary>
/// This routine appends the given argument to a command line such that
/// CommandLineToArgvW will return the argument string unchanged. Arguments
/// in a command line should be separated by spaces; this function does
/// not add these spaces.
/// </summary>
/// <param name="argument">Supplies the argument to encode.</param>
/// <param name="force">
/// Supplies an indication of whether we should quote the argument even if it
/// does not contain any characters that would ordinarily require quoting.
/// </param>
private static string EncodeParameterArgument(string argument, bool force = false)
{
if (argument == null) throw new ArgumentNullException(nameof(argument));
// Unless we're told otherwise, don't quote unless we actually
// need to do so --- hopefully avoid problems if programs won't
// parse quotes properly
if (force == false
&& argument.Length > 0
&& argument.IndexOfAny(" \t\n\v\"".ToCharArray()) == -1)
{
return argument;
}
var quoted = new StringBuilder();
quoted.Append('"');
var numberBackslashes = 0;
foreach (var chr in argument)
{
switch (chr)
{
case '\\':
numberBackslashes++;
continue;
case '"':
// Escape all backslashes and the following
// double quotation mark.
quoted.Append('\\', numberBackslashes*2 + 1);
quoted.Append(chr);
break;
default:
// Backslashes aren't special here.
quoted.Append('\\', numberBackslashes);
quoted.Append(chr);
break;
}
numberBackslashes = 0;
}
// Escape all backslashes, but let the terminating
// double quotation mark we add below be interpreted
// as a metacharacter.
quoted.Append('\\', numberBackslashes*2);
quoted.Append('"');
return quoted.ToString();
}
Solution 4:
I was running into issues with this, too. Instead of unparsing args, I went with taking the full original commandline and trimming off the executable. This had the additional benefit of keeping whitespace in the call, even if it isn't needed/used. It still has to chase escapes in the executable, but that seemed easier than the args.
var commandLine = Environment.CommandLine;
var argumentsString = "";
if(args.Length > 0)
{
// Re-escaping args to be the exact same as they were passed is hard and misses whitespace.
// Use the original command line and trim off the executable to get the args.
var argIndex = -1;
if(commandLine[0] == '"')
{
//Double-quotes mean we need to dig to find the closing double-quote.
var backslashPending = false;
var secondDoublequoteIndex = -1;
for(var i = 1; i < commandLine.Length; i++)
{
if(backslashPending)
{
backslashPending = false;
continue;
}
if(commandLine[i] == '\\')
{
backslashPending = true;
continue;
}
if(commandLine[i] == '"')
{
secondDoublequoteIndex = i + 1;
break;
}
}
argIndex = secondDoublequoteIndex;
}
else
{
// No double-quotes, so args begin after first whitespace.
argIndex = commandLine.IndexOf(" ", System.StringComparison.Ordinal);
}
if(argIndex != -1)
{
argumentsString = commandLine.Substring(argIndex + 1);
}
}
Console.WriteLine("argumentsString: " + argumentsString);