Why is Swift 100 times slower than C in this image processing test? [duplicate]
Like many other developers, I have been very excited about the new Swift language from Apple. Apple claims it is faster than Objective-C and can be used to write an operating system. From what I have learned so far, it is a statically typed language that gives precise control over the exact data type (such as integer width). So it looks like it has good potential for performance-critical tasks like image processing, right?
That's what I thought before I carried out a quick test. The result really surprised me.
Here is a simple code snippet in C:
test.c:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
uint8_t pixels[640*480];
uint8_t alpha[640*480];
uint8_t blended[640*480];
void blend(uint8_t* px, uint8_t* al, uint8_t* result, int size)
{
    for(int i=0; i<size; i++) {
        result[i] = (uint8_t)(((uint16_t)px[i]) *al[i] /255);
    }
}
int main(void)
{
    memset(pixels, 128, 640*480);
    memset(alpha, 128, 640*480);
    memset(blended, 255, 640*480);
    // Test 10 frames
    for(int i=0; i<10; i++) {
        blend(pixels, alpha, blended, 640*480);
    }
    return 0;
}
I compiled test.c on my 2011 MacBook Air with the following command:
clang -O3 test.c -o test
Processing 10 frames takes about 0.01s. In other words, the C code needs about 1ms per frame:
$ time ./test
real 0m0.010s
user 0m0.006s
sys 0m0.003s
Then I have a Swift version of the same code:
test.swift:
let pixels = UInt8[](count: 640*480, repeatedValue: 128)
let alpha = UInt8[](count: 640*480, repeatedValue: 128)
let blended = UInt8[](count: 640*480, repeatedValue: 255)
func blend(px: UInt8[], al: UInt8[], result: UInt8[], size: Int)
{
    for(var i=0; i<size; i++) {
        var b = (UInt16)(px[i]) * (UInt16)(al[i])
        result[i] = (UInt8)(b/255)
    }
}
for i in 0..10 {
    blend(pixels, alpha, blended, 640*480)
}
The build command line is:
xcrun swift -O3 test.swift -o test
Here I use the same -O3 optimization level to keep the comparison as fair as I can. However, the resulting program is 100 times slower:
$ time ./test
real 0m1.172s
user 0m1.146s
sys 0m0.006s
In other words, it takes Swift ~120ms to process one frame, which takes C just 1ms.
What happened?
Update: I am using clang:
$ gcc -v
Configured with: --prefix=/Applications/Xcode6-Beta.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.34.4) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.2.0
Thread model: posix
Update: more results with different running iterations:
Here are the results for different numbers of "frames", i.e. changing the iteration count of the main for loop from 10 to other values. Note that the C time per frame is now even lower (cache hot?), while the Swift time per frame barely changes:
                C Time (s)    Swift Time (s)
1 frame:        0.005         0.130
10 frames(*):   0.006         1.196
20 frames:      0.008         2.397
100 frames:     0.024         11.668
Update: `-Ofast` helps
With -Ofast, suggested by @mweathers, the Swift speed goes up to a reasonable range. On my laptop the Swift version built with -Ofast takes 0.013s for 10 frames and 0.048s for 100 frames, which is close to half the speed of the C version.
Building with:
xcrun swift -Ofast test.swift -o test
I'm getting times of:
real 0m0.052s
user 0m0.009s
sys 0m0.005s
Let's just concentrate on the answer to the question, which started with a "Why": Because you didn't turn optimisations on, and Swift relies heavily on compiler optimisation.
That said, doing image processing in C is truly daft. That's what you have CGImage and friends for.