Convert backslash-delimited string into an associative array
I have a string like this:
key1\value1\key2\value2\key3\value3\key4\value4\key5\value5
And I'd like it to be an associative array so that I can do:
echo $myArray['key1']; // prints value1
echo $myArray['key3']; // prints value3
//etc...
I know I can explode on the backslash, but not sure how to go from there.
Using a simple regex via preg_match_all and array_combine is often the shortest and quickest option:
preg_match_all("/([^\\\\]+)\\\\([^\\\\]+)/", $string, $p);
$array = array_combine($p[1], $p[2]);
Now this is of course a special case. Both keys and values are separated by a \ backslash, as are all pairs of them. The regex is also a bit lengthier due to the necessary double escaping.
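For illustration, here is a minimal runnable sketch of the two lines above. The sample string is shortened from the one in the question, and the variable names are just the ones used in this answer.

```php
<?php
// Single-quoted, so every \ in the literal is one real backslash.
$string = 'key1\value1\key2\value2\key3\value3';

// Group 1 collects the keys, group 2 the values.
preg_match_all("/([^\\\\]+)\\\\([^\\\\]+)/", $string, $p);
$array = array_combine($p[1], $p[2]);

echo $array['key1'], "\n"; // value1
echo $array['key3'], "\n"; // value3
```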
However, this scheme can be generalized to other key:value,-style strings.
Distinct key:value, separators
Common variations include : and = as key/value separators, and , or & and others as pair delimiters. The regex becomes rather obvious in such cases (with the /x flag for readability):
# ↓ ↓ ↓
preg_match_all("/ ([^:,]+) : ([^,]+) /x", $string, $p);
$array = array_combine($p[1], $p[2]);
Which makes it super easy to exchange : and , for other delimiters.
- Equal signs = instead of : colons.
- For example \\t as pair delimiter (tab-separated key:value lists).
- Classic & or ; as separator between key=value pairs.
- Or just \\s spaces or \\n newlines even.
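As a rough sketch of swapping delimiters, the snippet below runs the same pattern shape over two made-up sample strings. Note that the key class here excludes the pair delimiter as well as the key/value separator; without that, a key after the first pair would pick up the leftover delimiter in front of it.

```php
<?php
// Colon as key/value separator, comma as pair delimiter.
preg_match_all('/ ([^:,]+) : ([^,]+) /x', 'name:Alice,role:admin', $p);
$colonComma = array_combine($p[1], $p[2]);

// Same shape with = and ; swapped in.
preg_match_all('/ ([^=;]+) = ([^;]+) /x', 'name=Alice;role=admin', $q);
$equalsSemicolon = array_combine($q[1], $q[2]);

// Both arrays are now ['name' => 'Alice', 'role' => 'admin'].
```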
Allow varying delimiters
You can make it more flexible/forgiving by allowing different delimiters between keys/values/pairs:
# ↓ ↓ ↓
preg_match_all("/ ([^:=,+&]+) [:=]+ ([^,+&]+) /x", $string, $p);
Where all of key=value,key2:value2++key3==value3 would work. This can make sense for more human-friendliness (i.e. for non-technical users).
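To see the forgiving variant in action, here is a small sketch run against the mixed-delimiter sample string from this section. The key class in this sketch also excludes the pair delimiters, so keys do not absorb a preceding , or + character.

```php
<?php
$string = 'key=value,key2:value2++key3==value3';

// [:=]+ accepts one or more separators; , + and & all end a value.
preg_match_all('/ ([^:=,+&]+) [:=]+ ([^,+&]+) /x', $string, $p);
$array = array_combine($p[1], $p[2]);

// $array is ['key' => 'value', 'key2' => 'value2', 'key3' => 'value3'].
```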
Constrain alphanumeric keys
Oftentimes you may want to prohibit anything but classic key identifiers. Just use a \w+ word string pattern to make the regex skip over unwanted occurrences:
# ↓ ↓ ↓
preg_match_all("/ (\w+) = ([^,]+) /x", $string, $p);
This is the most trivial whitelisting approach. If OTOH you want to assert/constrain the whole key/value string beforehand, then craft a separate preg_match("/^(\w+=[^,]+(,|$))+/", …
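A short sketch of both ideas, using a made-up input with one malformed entry. The validation pattern below adds a trailing $ anchor so the whole string has to match; that anchor is an assumption of this sketch, not something spelled out above.

```php
<?php
// Made-up input with one malformed entry in the middle.
$string = 'name=Alice,!!=junk,role=admin';

// Whitelisting: \w+ cannot match "!!", so that entry is skipped.
preg_match_all('/ (\w+) = ([^,]+) /x', $string, $p);
$array = array_combine($p[1], $p[2]);
// $array is ['name' => 'Alice', 'role' => 'admin']

// Up-front validation: anchored at both ends, so every pair
// has to look like key=value.
$pattern = '/^(\w+=[^,]+(,|$))+$/';
$okGood = preg_match($pattern, 'name=Alice,role=admin'); // 1
$okBad  = preg_match($pattern, $string);                 // 0
```

Whether to skip bad entries silently or reject the whole string is a design choice; the second form is usually the safer one for input you do not control.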
Strip spaces or quoting
You can skip a few post-processing steps (such as trim on keys and values) with a small addition:
preg_match_all("/ \s*([^=,]+?) \s*=\s* ([^,]+) (?<!\s) /x", $string, $p);
Or for instance optional quotes:
preg_match_all("/ \s*([^=,]+?) \s*=\s* '? ([^,]+) (?<![\s']) /x", $string, $p);
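Here is a minimal sketch of the trimming and quote-stripping pattern on a made-up string with stray spaces and single quotes. The key group in this sketch is lazy and excludes the comma, so that keys come out without surrounding whitespace or a leftover pair delimiter.

```php
<?php
$string = "name = Alice , role = 'admin'";

// \s* soaks up padding, '? drops an opening quote, and the
// lookbehind (?<![\s']) makes the value end before a trailing
// space or closing quote.
preg_match_all("/ \s*([^=,]+?) \s*=\s* '? ([^,]+) (?<![\s']) /x", $string, $p);
$array = array_combine($p[1], $p[2]);

// $array is ['name' => 'Alice', 'role' => 'admin'].
```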
INI-style extraction
And you can craft a baseline INI-file extraction method:
preg_match_all("/^ \s*(\w+) \s*=\s* ['\"]?(.+?)['\"]? \s* $/xm", $string, $p);
Please note that this is just a crude subset of common INI schemes.
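As a quick illustration, the sketch below applies that pattern to a small made-up INI fragment. Comment lines simply fail to match, and all values come back as strings.

```php
<?php
// Made-up sample; real INI files have sections, escapes etc.
$string = "name = demo\nmode = \"fast\"\n; a comment line\nport = 8080";

// /m lets ^ and $ match at every line, /x allows the spacing.
preg_match_all("/^ \s*(\w+) \s*=\s* ['\"]?(.+?)['\"]? \s* $/xm", $string, $p);
$ini = array_combine($p[1], $p[2]);

// $ini is ['name' => 'demo', 'mode' => 'fast', 'port' => '8080'].
```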
Alternative: parse_str()
If you have a key=value&key2=value2 string already, then parse_str works like a charm. And by combining it with strtr, it can even process various other delimiters:
# ↓↓ ↑↑
parse_str(strtr($string, ":,", "=&"), $pairs);
Which has a couple of pros and cons of its own:
- Even shorter than the two-line regex approach.
- Predefines a well-known escaping mechanism (such as %2F for special characters).
- Does not permit varying delimiters, or unescaped delimiters within.
- Automatically converts keys[]= to arrays, which you may or may not want though.
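A small sketch of the points in this list, using a made-up input string: strtr maps : and , onto = and &, then parse_str does the decoding and the array conversion.

```php
<?php
$string = 'name:Alice,path:a%2Fb,tag[]:x,tag[]:y';

// ":" becomes "=", "," becomes "&", then parse as a query string.
parse_str(strtr($string, ':,', '=&'), $pairs);

// $pairs['name'] is 'Alice'
// $pairs['path'] is 'a/b'      (the %2F escape is decoded)
// $pairs['tag']  is ['x', 'y'] (tag[] entries are collected)
```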
Alternative: explode + foreach
You'll find many examples of manual key/value string expansion, though this is often more code. explode is somewhat overused in PHP due to optimization assumptions. After profiling, however, it often turns out to be slower due to the manual foreach and array collection.
What about something like this :
$str = 'key1\value1\key2\value2\key3\value3\key4\value4\key5\value5';
$list = explode('\\', $str); // '\\' is a single literal backslash
$result = array();
// Walk the list two items at a time: key at $i, value at $i+1
for ($i = 0; $i < count($list); $i += 2) {
    $result[ $list[$i] ] = $list[$i+1];
}
var_dump($result);
Which would get you :
array
'key1' => string 'value1' (length=6)
'key2' => string 'value2' (length=6)
'key3' => string 'value3' (length=6)
'key4' => string 'value4' (length=6)
'key5' => string 'value5' (length=6)
Basically, here, the idea is to :
- split the string, which will give you an array such as 'key1', 'value1', 'key2', 'value2', ...
- and then iterate over this list, with a jump of 2, using each time:
  - one element as the key -- the one pointed to by $i
  - the one just after it as the value -- the one pointed to by $i+1
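The steps above could also be wrapped in a small function. This is only a sketch: pairsToArray is a hypothetical name, and dropping a trailing key that has no value is an assumption about what you would want for odd-length input.

```php
<?php
// Hypothetical helper: split on a delimiter, then pair up
// consecutive items as key => value.
function pairsToArray(string $str, string $delimiter = '\\'): array
{
    $list = explode($delimiter, $str);
    $result = [];
    // Stop before a trailing key that has no value after it.
    for ($i = 0, $n = count($list); $i + 1 < $n; $i += 2) {
        $result[$list[$i]] = $list[$i + 1];
    }
    return $result;
}

$result = pairsToArray('key1\value1\key2\value2');
// $result['key2'] is 'value2'

$odd = pairsToArray('key1\value1\dangling');
// $odd contains only 'key1' => 'value1'
```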