Split string containing command-line parameters into string[] in C#
I have a single string that contains the command-line parameters to be passed to another executable and I need to extract the string[] containing the individual parameters in the same way that C# would if the commands had been specified on the command-line. The string[] will be used when executing another assemblies entry-point via reflection.
Is there a standard function for this? Or is there a preferred method (regex?) for splitting the parameters correctly? It must handle '"' delimited strings that may contain spaces correctly, so I can't just split on ' '.
Example string:
string parameterString = @"/src:""C:\tmp\Some Folder\Sub Folder"" /users:""[email protected]"" tasks:""SomeTask,Some Other Task"" -someParam foo";
Example result:
string[] parameterArray = new string[] {
@"/src:C:\tmp\Some Folder\Sub Folder",
@"/users:[email protected]",
@"tasks:SomeTask,Some Other Task",
@"-someParam",
@"foo"
};
I do not need a command-line parsing library, just a way to get the String[] that should be generated.
Update: I had to change the expected result to match what is actually generated by C# (removed the extra "'s in the split strings)
Solution 1:
It annoys me that there's no function to split a string based on a function that examines each character. If there was, you could write it like this:
public static IEnumerable<string> SplitCommandLine(string commandLine)
{
bool inQuotes = false;
return commandLine.Split(c =>
{
if (c == '\"')
inQuotes = !inQuotes;
return !inQuotes && c == ' ';
})
.Select(arg => arg.Trim().TrimMatchingQuotes('\"'))
.Where(arg => !string.IsNullOrEmpty(arg));
}
Although having written that, why not write the necessary extension methods. Okay, you talked me into it...
Firstly, my own version of Split that takes a function that has to decide whether the specified character should split the string:
public static IEnumerable<string> Split(this string str,
Func<char, bool> controller)
{
int nextPiece = 0;
for (int c = 0; c < str.Length; c++)
{
if (controller(str[c]))
{
yield return str.Substring(nextPiece, c - nextPiece);
nextPiece = c + 1;
}
}
yield return str.Substring(nextPiece);
}
It may yield some empty strings depending on the situation, but maybe that information will be useful in other cases, so I don't remove the empty entries in this function.
Secondly (and more mundanely) a little helper that will trim a matching pair of quotes from the start and end of a string. It's more fussy than the standard Trim method - it will only trim one character from each end, and it will not trim from just one end:
public static string TrimMatchingQuotes(this string input, char quote)
{
if ((input.Length >= 2) &&
(input[0] == quote) && (input[input.Length - 1] == quote))
return input.Substring(1, input.Length - 2);
return input;
}
And I suppose you'll want some tests as well. Well, alright then. But this must be absolutely the last thing! First a helper function that compares the result of the split with the expected array contents:
public static void Test(string cmdLine, params string[] args)
{
string[] split = SplitCommandLine(cmdLine).ToArray();
Debug.Assert(split.Length == args.Length);
for (int n = 0; n < split.Length; n++)
Debug.Assert(split[n] == args[n]);
}
Then I can write tests like this:
Test("");
Test("a", "a");
Test(" abc ", "abc");
Test("a b ", "a", "b");
Test("a b \"c d\"", "a", "b", "c d");
Here's the test for your requirements:
Test(@"/src:""C:\tmp\Some Folder\Sub Folder"" /users:""[email protected]"" tasks:""SomeTask,Some Other Task"" -someParam",
@"/src:""C:\tmp\Some Folder\Sub Folder""", @"/users:""[email protected]""", @"tasks:""SomeTask,Some Other Task""", @"-someParam");
Note that the implementation has the extra feature that it will remove quotes around an argument if that makes sense (thanks to the TrimMatchingQuotes function). I believe that's part of the normal command-line interpretation.
Solution 2:
In addition to the good and pure managed solution by Earwicker, it may be worth mentioning, for sake of completeness, that Windows also provides the CommandLineToArgvW
function for breaking up a string into an array of strings:
LPWSTR *CommandLineToArgvW( LPCWSTR lpCmdLine, int *pNumArgs);
Parses a Unicode command line string and returns an array of pointers to the command line arguments, along with a count of such arguments, in a way that is similar to the standard C run-time argv and argc values.
An example of calling this API from C# and unpacking the resulting string array in managed code can be found at, “Converting Command Line String to Args[] using CommandLineToArgvW() API.” Below is a slightly simpler version of the same code:
[DllImport("shell32.dll", SetLastError = true)]
static extern IntPtr CommandLineToArgvW(
[MarshalAs(UnmanagedType.LPWStr)] string lpCmdLine, out int pNumArgs);
public static string[] CommandLineToArgs(string commandLine)
{
int argc;
var argv = CommandLineToArgvW(commandLine, out argc);
if (argv == IntPtr.Zero)
throw new System.ComponentModel.Win32Exception();
try
{
var args = new string[argc];
for (var i = 0; i < args.Length; i++)
{
var p = Marshal.ReadIntPtr(argv, i * IntPtr.Size);
args[i] = Marshal.PtrToStringUni(p);
}
return args;
}
finally
{
Marshal.FreeHGlobal(argv);
}
}
Solution 3:
The Windows command-line parser behaves just as you say, split on space unless there's a unclosed quote before it. I would recommend writing the parser yourself. Something like this maybe:
static string[] ParseArguments(string commandLine)
{
char[] parmChars = commandLine.ToCharArray();
bool inQuote = false;
for (int index = 0; index < parmChars.Length; index++)
{
if (parmChars[index] == '"')
inQuote = !inQuote;
if (!inQuote && parmChars[index] == ' ')
parmChars[index] = '\n';
}
return (new string(parmChars)).Split('\n');
}