What is the order of source operands in AT&T syntax compared to Intel syntax?

The Intel ISA reference documentation for this instruction is clear:

VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4

Select byte values from xmm2 and xmm3/m128 using mask bits in the specified mask register, xmm4, and store the values into xmm1.

xmm1 is the destination; xmm2, xmm3, and xmm4 are the source operands.
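The order matters because the two data sources are not interchangeable: in the manual's operation description, a set high bit in a mask byte selects the corresponding byte from the second source (xmm3/m128), and a clear bit selects it from the first source (xmm2). A rough Python model of the 128-bit form, where the function name vpblendvb_model is purely illustrative:

```python
def vpblendvb_model(src1: bytes, src2: bytes, mask: bytes) -> bytes:
    """Byte-wise blend: bit 7 of each mask byte picks src2, otherwise src1.

    src1 models xmm2, src2 models xmm3/m128, mask models xmm4 in the
    Intel operand list; the return value models the destination xmm1.
    """
    if not (len(src1) == len(src2) == len(mask) == 16):
        raise ValueError("all operands must be 16 bytes (128 bits)")
    return bytes(b if m & 0x80 else a for a, b, m in zip(src1, src2, mask))


if __name__ == "__main__":
    first = bytes(range(16))       # 00 01 02 ... 0f
    second = bytes([0xFF] * 16)    # ff ff ff ... ff
    select = bytes([0x80, 0x00] * 8)  # pick src2 for even byte positions
    print(vpblendvb_model(first, second, select).hex())
```

Swapping the first and second source, or confusing either with the mask, gives a different result, which is why the exact AT&T ordering needs to be pinned down.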

So what does this become using AT&T syntax? We know that the destination register must be last, but what is the order of source operands?

vpblendvb $xmm2, $xmm3, $xmm4, $xmm1

or

vpblendvb $xmm4, $xmm3, $xmm2, $xmm1

or something else?


Assembling the following (note that GAS uses % rather than $ to denote registers):

vpblendvb %xmm4, %xmm3, %xmm2, %xmm1

with the GNU assembler (version 2.21.0.20110327 on x86_64 Linux 2.6.38) and then disassembling the result yields:

$ objdump -d a.out
    0:    c4 e3 69 4c cb 40     vpblendvb %xmm4,%xmm3,%xmm2,%xmm1

and in Intel syntax (matching the operand order the manual shows):

$ objdump -d -M intel a.out
    0:    c4 e3 69 4c cb 40     vpblendvb xmm1,xmm2,xmm3,xmm4

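So the order of all the operands is reversed, not just the destination: in AT&T syntax the mask register (xmm4) comes first and the destination (xmm1) comes last.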
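This can be cross-checked without trusting the disassembler by decoding the six bytes by hand. The sketch below assumes the three-byte VEX layout from the Intel manual (inverted R/X/B bits and map select in the second byte, inverted vvvv in the third, the fourth register in the upper nibble of the trailing immediate) and handles only the register-to-register form. The names decode_vpblendvb, intel and att are made up for this example.

```python
def decode_vpblendvb(code: bytes) -> dict:
    """Pull the four register numbers out of a register-only VPBLENDVB."""
    if len(code) != 6 or code[0] != 0xC4:
        raise ValueError("expected 6 bytes starting with the C4 VEX prefix")
    vex1, vex2, opcode, modrm, imm8 = code[1:6]
    if (vex1 & 0x1F) != 0x03 or opcode != 0x4C:
        raise ValueError("not opcode 4C in the 0F 3A map")
    if modrm >> 6 != 0b11:
        raise ValueError("memory operands are not handled in this sketch")
    r = ((vex1 >> 7) & 1) ^ 1          # VEX.R is stored inverted
    b = ((vex1 >> 5) & 1) ^ 1          # VEX.B is stored inverted
    return {
        "dest": (r << 3) | ((modrm >> 3) & 7),  # ModRM.reg
        "src1": ((vex2 >> 3) & 0xF) ^ 0xF,      # VEX.vvvv, stored inverted
        "src2": (b << 3) | (modrm & 7),         # ModRM.rm
        "mask": imm8 >> 4,                      # imm8[7:4]
    }


def intel(ops: dict) -> str:
    return "vpblendvb xmm{dest},xmm{src1},xmm{src2},xmm{mask}".format(**ops)


def att(ops: dict) -> str:
    return "vpblendvb %xmm{mask},%xmm{src2},%xmm{src1},%xmm{dest}".format(**ops)


if __name__ == "__main__":
    ops = decode_vpblendvb(bytes.fromhex("c4e3694ccb40"))
    print(intel(ops))
    print(att(ops))
```

For the bytes shown above, the decoded fields are dest=1, src1=2, src2=3 and mask=4, which agrees with both objdump listings.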