Solution 1:

> I have an application that sends 100 of 186-byte (excluding headers) TCP messages back to back without gap from host A to host B.

Then you may be sending them faster than the network can transport them. In that case, by the time the sender's TCP implementation is ready to put a packet on the network, several messages may already be queued, and it will send as many of them as fit in a single TCP segment. TCP offers a byte-stream service with no notion of message boundaries, so it is permitted to do that.
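Because TCP delivers a byte stream, the receiver has to recover message boundaries itself. Here is a minimal sketch of that, assuming every message is exactly 186 bytes as in the question. The names `MESSAGE_SIZE`, `recv_exact` and `recv_messages` are made up for illustration, and a protocol with variable-length messages would need a length prefix or delimiter instead.

```python
import socket

# Assumption: every application message is exactly 186 bytes, as in the question.
MESSAGE_SIZE = 186


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, however TCP happened to segment them."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # recv() returning b"" means the peer closed the connection.
            raise ConnectionError("connection closed in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_messages(sock: socket.socket, count: int, size: int = MESSAGE_SIZE) -> list:
    """Read `count` fixed-size messages from the byte stream."""
    return [recv_exact(sock, size) for _ in range(count)]
```

With this, it does not matter whether the 100 messages arrive in 100 segments or in a handful of merged ones; the application sees the same 100 messages either way.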

> I have already turned on Nagle's algorithm

Nagle's algorithm explicitly does what you're saying the TCP on the sender is doing:

> Nagle's algorithm works by combining a number of small outgoing messages, and sending them all at once.

so turning it on won't prevent that. Turning it off might prevent it in some cases, but given that your application sends a burst of messages, it probably won't.
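For reference, turning Nagle's algorithm off is done per socket with the `TCP_NODELAY` option. The sketch below shows one way to do it in Python; the helper names `disable_nagle` and `nagle_disabled` are made up for illustration. As noted above, this only removes one reason for merging and does not guarantee one message per segment.

```python
import socket


def disable_nagle(sock: socket.socket) -> None:
    """Turn Nagle's algorithm off by setting TCP_NODELAY on a TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def nagle_disabled(sock: socket.socket) -> bool:
    """Return True if TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```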

(I.e., the answer to "why did the TCP on the sender merge the messages?" is "because it can".)

Solution 2:

What you are seeing is most likely due to functionality being offloaded from the kernel network stack to the network interface and/or its driver.

The network interface will still be receiving the individual packets from the network. But before the packets are handed off to the kernel they are merged by either the interface or the driver.

You can see the current settings of all the offload features using this command:

ethtool -k eth0

If you want to disable this particular feature, it can be done with this command:

ethtool -K eth0 generic-receive-offload off
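If you want to check the setting from a script before taking a capture, something like the following sketch may help. It assumes the usual `name: on` / `name: off [fixed]` layout of `ethtool -k` output; the function names are made up for illustration, and `read_offload_features` needs `ethtool` to be installed.

```python
import subprocess


def parse_offload_features(ethtool_output: str) -> dict:
    """Map each feature name in `ethtool -k` output to True (on) or False (off)."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # skips blank lines and the "Features for eth0:" header
        state = value.split()[0]  # drops a trailing "[fixed]" marker, if any
        if state in ("on", "off"):
            features[name.strip()] = state == "on"
    return features


def read_offload_features(interface: str) -> dict:
    """Run `ethtool -k <interface>` and parse its output."""
    result = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_offload_features(result.stdout)
```

For example, `read_offload_features("eth0").get("generic-receive-offload")` tells you whether the feature disabled above is currently on.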

You can read more about offloading in this older question.