Is there a way to keep the delimiter when using PHP's explode or similar functions?
For example, I have an article that should be split at sentence boundaries such as ".", "?", "!" and ":".
But as we all know, both preg_split and explode remove the delimiter.
Any help would be really appreciated!
EDIT:
I could only come up with the code below, though it works great.
$content=preg_replace('/([\.\?\!\:])/',"\\1[D]",$content);
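For completeness, here is a rough sketch of how that marker could then be used to get the pieces. The sample text is made up, and the sketch assumes the marker [D] never occurs in the real content; the trimming and empty-piece filtering are extras, not part of the line above.

```php
<?php
// Hypothetical sample text, not from the original article.
$content = "Is it done? Yes! Note: it works.";

// Append the marker [D] after every delimiter, as in the line above.
$content = preg_replace('/([.?!:])/', '$1[D]', $content);

// Split on the marker; each delimiter stays attached to its piece.
$pieces = explode('[D]', $content);

// Trim whitespace and drop the empty piece left after the final marker.
$sentences = array_values(array_filter(array_map('trim', $pieces), 'strlen'));

print_r($sentences);
```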
Thank you, everyone!!! It took only five minutes to get 3 answers! And I must apologize for not reading the PHP manual carefully before asking the question. Sorry.
I feel this is worth adding. You can keep the delimiter in the "after" string by using regex lookahead to split:
$input = "The address is http://stackoverflow.com/";
$parts = preg_split('@(?=http://)@', $input);
// $parts[1] is "http://stackoverflow.com/"
And if the delimiter is of fixed length, you can keep the delimiter in the "before" part by using lookbehind:
$input = "The address is http://stackoverflow.com/";
$parts = preg_split('@(?<=http://)@', $input);
// $parts[0] is "The address is http://"
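This lookaround approach is simpler and cleaner in most cases, because the delimiter stays attached to its piece and nothing has to be reassembled afterwards.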
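Applied to the sentence delimiters from the question, the lookbehind variant might look like the sketch below. The sample string is made up for illustration, and both the \s* that swallows whitespace after a delimiter and the PREG_SPLIT_NO_EMPTY flag are additions of this sketch rather than part of the examples above.

```php
<?php
// Hypothetical sample text, not from the original question.
$text = "Is it done? Yes! Note: it works.";

// (?<=[.?!:]) is a fixed-length lookbehind: it matches the zero-width
// position right after a delimiter, so the delimiter itself is not consumed
// and stays at the end of the preceding piece.
// \s* consumes any whitespace that follows the delimiter.
// PREG_SPLIT_NO_EMPTY drops the empty piece after a trailing delimiter.
$sentences = preg_split('/(?<=[.?!:])\s*/', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($sentences);
```

Swapping the lookbehind for a lookahead, (?=[.?!:]), would instead leave each delimiter at the start of the following piece.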
You can set the PREG_SPLIT_DELIM_CAPTURE flag when calling preg_split so that the delimiters are captured too. Then take each pair of elements at indices 2n and 2n+1 and join them back together:
$parts = preg_split('/([.?!:])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
$sentences = [];
for ($i = 0, $n = count($parts) - 1; $i <= $n; $i += 2) {
    $sentences[] = $parts[$i] . ($parts[$i+1] ?? '');
}
Note that the delimiter pattern must be wrapped in a capturing group; otherwise the delimiters won't be captured.
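The same idea can be wrapped in a small function, roughly as sketched here. The function name splitKeepingDelimiters and the sample text are hypothetical, and the trimming and skipping of empty pieces are extras that the loop above does not do.

```php
<?php
/**
 * Hypothetical helper (the name is illustrative): split $str after each
 * delimiter character, keeping the delimiter attached to its sentence.
 */
function splitKeepingDelimiters(string $str): array
{
    // The parentheses form the capturing group, so every delimiter is
    // returned as its own array element between the text pieces.
    $parts = preg_split('/([.?!:])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $sentences = [];
    for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
        // Even index: text. Odd index: the delimiter that followed it, if any.
        $piece = trim($parts[$i] . ($parts[$i + 1] ?? ''));
        if ($piece !== '') {
            $sentences[] = $piece; // skip the empty tail after a final delimiter
        }
    }
    return $sentences;
}

print_r(splitKeepingDelimiters("Is it done? Yes! Note: it works."));
```

Text without any delimiter comes back as a single-element array, since preg_split then returns the whole input as one piece.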