Newbetuts
New posts in sse
Do I get a performance penalty when mixing SSE integer/float SIMD instructions
c
assembly
sse
simd
intrinsics
How to determine if memory is aligned?
c
optimization
memory
sse
simd
Sum reduction of unsigned bytes without overflow, using SSE2 on Intel
x86
sse
simd
sse2
sse3
Is SSE floating-point arithmetic reproducible?
.net
floating-point
sse
ieee-754
x87
Fast method to copy memory with translation - ARGB to BGR
c
x86
rgb
sse
micro-optimization
How to check if a CPU supports the SSE3 instruction set?
c++
sse
instruction-set
avx
cpuid
C++ error: ‘_mm_sin_ps’ was not declared in this scope
c++
optimization
sse
simd
intrinsics
Fastest Implementation of the Natural Exponential Function Using SSE
c
optimization
vectorization
sse
simd
Loop unrolling to achieve maximum throughput with Ivy Bridge and Haswell
c++
x86
intel
sse
avx
SSE, intrinsics, and alignment
c++
alignment
sse
intrinsics
How to detect SSE/SSE2/AVX/AVX2/AVX-512/AVX-128-FMA/KCVI availability at compile-time?
gcc
clang
sse
avx
avx512
How to sum __m256 horizontally?
sse
vectorization
intrinsics
avx
How is a vector's data aligned?
c++
vector
sse
memory-alignment
allocator
What's the difference between logical SSE intrinsics?
c
sse
simd
intrinsics
sse2
Fastest way to compute absolute value using SSE
x86
vectorization
sse
simd
absolute-value
How to use Fused Multiply-Add (FMA) instructions with SSE/AVX
c
sse
cpu-architecture
avx
fma
Websocket transport reliability (Socket.io data loss during reconnection)
node.js
websocket
socket.io
sse
eventsource
Using AVX CPU instructions: Poor performance without "/arch:AVX"
c++
performance
visual-studio-2010
sse
avx
How do I enable SSE for my freestanding bootable code?
x86
sse
instruction-set
Fast vectorized rsqrt and reciprocal with SSE/AVX depending on precision
performance
sse
simd
avx