isset() vs strlen() - a fast/clear string length calculation
I came across this code...
if(isset($string[255])) {
// too long
}
isset() is between 6 and 40 times faster than
if(strlen($string) > 255) {
// too long
}
The only drawback of isset() is that the code is unclear: we cannot tell right away what is being done (see pekka's answer). We could wrap isset() in a function, e.g. strlt($string,255), but we would then lose the speed benefit of isset().
How can we use the faster isset() function while retaining readability of the code?
EDIT : test to show the speed http://codepad.org/ztYF0bE3
strlen() over 1000000 iterations 7.5193998813629
isset() over 1000000 iterations 0.29940009117126
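For anyone who wants to reproduce the comparison locally, here is a rough sketch of such a benchmark. It is not the code from the codepad link; the 300-byte test string and the variable names are my own assumptions, and the absolute numbers will depend on the PHP version and the machine.

```php
<?php
// Rough micro-benchmark sketch; not the original codepad script.
$string = str_repeat('a', 300);   // assumed test input, longer than 255 bytes
$iterations = 1000000;

// Time the strlen() variant.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $tooLong = strlen($string) > 255;
}
$strlenSeconds = microtime(true) - $start;

// Time the isset() variant.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $tooLong = isset($string[255]);
}
$issetSeconds = microtime(true) - $start;

printf("strlen() over %d iterations %.4f s\n", $iterations, $strlenSeconds);
printf("isset() over %d iterations %.4f s\n", $iterations, $issetSeconds);
```

Both loops store the result in $tooLong so that neither check is left as a bare expression; the loop overhead itself is included in both timings.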
EDIT2 : here's why isset() is faster
$string = 'abcdefg';
var_dump($string[2]);
Output: string(1) "c"
$string = 'abcdefg';
if (isset($string[7])) {
    echo $string[7].' found!';
} else {
    echo 'No character found at position 7!';
}
This is faster than using strlen() because, “… calling a function is more expensive than using a language construct.” http://www.phpreferencebook.com/tips/use-isset-instead-of-strlen/
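To make the equivalence explicit: for a plain string, offset N exists exactly when the string is longer than N bytes, so the two checks should always agree. A minimal sketch, assuming $string really is a string (for an array, isset() would test for a key instead):

```php
<?php
// For a plain string, offset N is set exactly when strlen() > N.
$string = 'abcdefg';              // 7 bytes, valid offsets 0..6

var_dump(isset($string[6]));      // bool(true)
var_dump(strlen($string) > 6);    // bool(true)

var_dump(isset($string[7]));      // bool(false)
var_dump(strlen($string) > 7);    // bool(false)
```

Note that both string offsets and strlen() count bytes, not characters, so the two forms also agree on multibyte text.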
EDIT3 : I was always taught to be interested in micro-optimisation, probably because I was taught at a time when resources on computers were tiny. I'm open to the idea that it may not be important; there are some good arguments against it in the answers. I've started a new question exploring this... https://stackoverflow.com/questions/6983208/is-micro-optimisation-important-when-coding
OK so I ran the tests since I could hardly believe that the isset() method is faster, but yes it is, and considerably so. The isset() method is consistently about 6 times faster.
I tried strings of various sizes and varying numbers of iterations; the ratio remains the same, and so does the total running time for strings of different sizes, because both isset() and strlen() are O(1). That makes sense: isset() only needs to do a lookup in a C array, and strlen() only returns the length that is stored with the string.
I looked it up in the php source, and I think I roughly understand why. isset(), because it is not a function but a language construct, has its own opcode in the Zend VM. Therefore, it doesn't need to be looked up in the function table and it can do more specialized parameter parsing. Code is in zend_builtin_functions.c for strlen() and zend_compile.c for isset(), for those interested.
To tie this back to the original question, I don't see any issues with the isset() method from a technical point of view, but IMO it is harder to read for people who are not used to the idiom. Furthermore, the isset() method will be constant in time, while the strlen() method will be O(n) in the number of functions that are built into PHP. That is, if you build PHP and statically compile in many functions, all function calls (including strlen()) will be slower, but isset() will stay constant. In practice, however, this difference will be negligible. I also don't know how many function pointer tables are maintained, or whether user-defined functions have an influence as well. I seem to remember they are in a different table and are therefore irrelevant for this case, but it's been a while since I last really worked with this.
For the rest I don't see any drawbacks to the isset() method. I don't know of other ways to get the length of a string, when not considering purposefully convoluted ones like explode+count and things like that.
Finally, I also tested your suggestion above of wrapping isset() into a function. This is slower than even the strlen() method because you need another function call, and therefore another hash table lookup. The overhead of the extra parameter (for the size to check against) is negligible; as is the copying of the string when not passed by reference.
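For reference, a wrapper of the kind tested here might look like the sketch below. The function name and signature are hypothetical (they are not from the question), the type declarations assume PHP 7 or later, and $maxLength is assumed to be non-negative.

```php
<?php
// Hypothetical helper; name and signature are illustrative only.
function string_is_longer_than(string $string, int $maxLength): bool
{
    // Offset $maxLength exists only if the string holds more than $maxLength bytes.
    return isset($string[$maxLength]);
}

if (string_is_longer_than(str_repeat('a', 300), 255)) {
    echo "too long\n";
}
```

This reads clearly at the call site, but every call pays for a user-function call, which is exactly the overhead described above.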
Any speed difference in this is of absolutely no consequence. It will be a few milliseconds at best.
Use whichever style is most readable to you and anybody else working on the code. I personally would strongly vote for the second example because, unlike the first one, it makes the intention (checking the length of a string) absolutely clear.
Your code is incomplete.
Here, I fixed it for you:
if(isset($string[255])) {
// something taking 1 millisecond
}
vs
if(strlen($string) > 255) {
// something taking 1 millisecond
}
Now you don't have an empty loop, but a realistic one. Let's assume it takes 1 millisecond to do something.
A modern CPU can do a lot of things in 1 millisecond, that is a given. But things like a random hard drive access or a database request take multiple milliseconds, which is also a realistic scenario.
Now let's calculate the timings again:
realistic routine + strlen() over 1000000 iterations 1007.5193998813629
realistic routine + isset() over 1000000 iterations 1000.29940009117126
See the difference?
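To put a number on it, the two totals above can be compared directly. This is just arithmetic on the quoted figures, assuming they are in seconds (1 millisecond of work times 1000000 iterations is 1000 seconds); the variable names are mine.

```php
<?php
// Totals quoted above, assumed to be in seconds.
$strlenTotal = 1007.5193998813629;
$issetTotal  = 1000.29940009117126;

$difference = $strlenTotal - $issetTotal;          // absolute saving
$relative   = $difference / $issetTotal * 100;     // saving as a percentage

printf("difference: %.2f s, relative: %.2f %%\n", $difference, $relative);
// prints "difference: 7.22 s, relative: 0.72 %"
```

So the 25-fold gap in the empty-loop benchmark shrinks to well under one percent of the total run time once the loop body does any real work.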