Poor iSCSI performance with SSD disks and 10 GbE network

Short answer: this is the result of network latency and a serial workload (which you imposed by using direct=1, sync=1 and iodepth=1).

Long answer: with direct=1, sync=1 and iodepth=1 you created a serial workload, as a new write cannot be queued before the previous write has been committed and confirmed. In other words, the write submission rate strictly depends on network latency. A simple ping between two machines can easily be in excess of 0.2 ms, more so when using a higher-level protocol such as TCP (and iSCSI on top of it). Assuming a total network latency of about 0.33 ms, you have a maximum of about 3000 IOPS (1 / 0.00033 s ≈ 3030). This does not account for other latency sources (e.g. the disks themselves), so it is in line with what you recorded.
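The arithmetic can be checked with a quick one-liner; the 0.33 ms figure is the assumed total per-request latency from above:

```shell
# For a fully serial workload, max IOPS = 1 / per-request latency.
# 0.00033 s (0.33 ms) is the assumed total round-trip latency.
awk 'BEGIN { printf "max IOPS ~= %.0f\n", 1 / 0.00033 }'
```

Any extra latency (disk commit time, target-side processing) lowers this ceiling further, which is why increasing queue depth matters so much on high-latency paths.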

Try this: run a first benchmark without --direct=1 --sync=1, and another with those options in place but with iodepth increased to 32 requests. Then report the results here.
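As a sketch, the two runs could look like the following (the filename, size, block size and I/O pattern are placeholders; adjust them to match your original test):

```shell
# Run 1: buffered, non-synchronous writes at queue depth 1.
fio --name=buffered --filename=/path/to/testfile --size=1G \
    --rw=randwrite --bs=4k --ioengine=libaio --iodepth=1

# Run 2: same direct/sync constraints as your original test, but with
# 32 requests in flight so network latency is overlapped, not paid per write.
fio --name=qd32 --filename=/path/to/testfile --size=1G \
    --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 \
    --direct=1 --sync=1
```

If the theory above is right, the second run should scale IOPS roughly with queue depth until some other bottleneck (disk, CPU, or link bandwidth) takes over.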