As part of a compiler project I have to write GNU assembler code for x86 to compare floating point values. I have tried to find resources on how to do this online and from what I understand it works like this:

Assuming the two values I want to compare are the only values on the floating point stack, then the fcomi instruction will compare the values and set the CPU-flags so that the je, jne, jl, ... instructions can be used.

I'm asking because this only works sometimes. For example:

.section    .data
msg:    .ascii "Hallo\n\0"
f1:     .float 10.0
f2:     .float 9.0

.globl main
    .type   main, @function
main:
    flds f1
    flds f2
    fcomi
    jg leb
    pushl $msg
    call printf
    addl $4, %esp
leb:
    pushl $0
    call exit

will not print "Hallo" even though I think it should, and if you switch f1 and f2 it still won't which is a logical contradiction. je and jne however seem to work fine.

What am I doing wrong?

PS: does the fcomip pop only one value or does it pop both?


Solution 1:

TL:DR: Use above / below conditions (like for unsigned integer) to test the result of compares.

For various historical reasons (mapping from FP status word to FLAGS via fcom / fstsw / sahf which fcomi (new in PPro) matches), FP compares set CF, not OF / SF. See also http://www.ray.masmcode.com/tutorial/fpuchap7.htm


This is all coming from Volume 2 of Intel 64 and IA-32 Architectures Software Developer's Manuals.

FCOMI sets only some of the flags that CMP does. Your code has %st(0) == 9 and %st(1) == 10. (Since it's a stack they're loaded onto), referring to the table on page 3-348 in Volume 2A you can see that this is the case "ST0 < ST(i)", so it will clear ZF and PF and set CF. Meanwhile on pg. 3-544 Vol. 2A you can read that JG means "Jump short if greater (ZF=0 and SF=OF)". In other words it's testing the sign, overflow and zero flags, but FCOMI doesn't set sign or overflow!

Depending on which conditions you wish to jump, you should look at the possible comparison results and decide when you want to jump.

+--------------------+---+---+---+
| Comparison results | Z | P | C |
+--------------------+---+---+---+
| ST0 > ST(i)        | 0 | 0 | 0 |
| ST0 < ST(i)        | 0 | 0 | 1 |
| ST0 = ST(i)        | 1 | 0 | 0 |
| unordered          | 1 | 1 | 1 |  one or both operands were NaN.
+--------------------+---+---+---+

I've made this small table to make it easier to figure out:

+--------------+---+---+-----+------------------------------------+
| Test         | Z | C | Jcc | Notes                              |
+--------------+---+---+-----+------------------------------------+
| ST0 < ST(i)  | X | 1 | JB  | ZF will never be set when CF = 1   |
| ST0 <= ST(i) | 1 | 1 | JBE | Either ZF or CF is ok              |
| ST0 == ST(i) | 1 | X | JE  | CF will never be set in this case  |
| ST0 != ST(i) | 0 | X | JNE |                                    |
| ST0 >= ST(i) | X | 0 | JAE | As long as CF is clear we are good |
| ST0 > ST(i)  | 0 | 0 | JA  | Both CF and ZF must be clear       |
+--------------+---+---+-----+------------------------------------+
Legend: X: don't care, 0: clear, 1: set

In other words the condition codes match those for using unsigned comparisons. The same goes if you're using FMOVcc.

If either (or both) operand to fcomi is NaN, it sets ZF=1 PF=1 CF=1. (FP compares have 4 possible results: >, <, ==, or unordered). If you care what your code does with NaNs, you may need an extra jp or jnp. But not always: for example, ja is only true if CF=0 and ZF=0, so it will be not-taken in the unordered case. If you want the unordered case to take the same execution path as below or equal, then ja is all you need.


Here you should use JA if you want it to print (ie. if (!(f2 > f1)) { puts("hello"); }) and JBE if you don't (corresponds to if (!(f2 <= f1)) { puts("hello"); }). (Note this might be a little confusing due to the fact that we only print if we don't jump).


Regarding your second question: by default fcomi doesn't pop anything. You want its close cousin fcomip which pops %st0. You should always clear the fpu register stack after usage, so all in all your program ends up like this assuming you want the message printed:

.section    .rodata
msg:    .ascii "Hallo\n\0"
f1:     .float 10.0
f2:     .float 9.0 

.globl main
    .type   main, @function
main:
    flds   f1
    flds   f2
    fcomip
    fstp   %st(0) # to clear stack
    ja     leb # won't jump, jbe will
    pushl  $msg
    call   printf
    addl   $4, %esp
leb:
    pushl  $0
    call   exit