Multiple applications using same multicast group on the same Windows host

I am a software engineer designing distributed software whose startup procedure uses IP multicast to discover its peers. The software is distributed as separate executable modules, so it sometimes makes sense to run several modules on the same host. This is where my issues begin, as it seems to me that Windows is not good at managing several processes subscribing to a single multicast group.

My aim is to be able to freely choose which executables I start on which hosts, and to have each of them discover all its peers (be it on the same host or on other hosts) without any predefined knowledge.

So far, troubleshooting indicates that the issue is how Windows handles the scenario of multiple processes subscribing to the same multicast groups, because:

  1. If the same process is used for sending and receiving multicast datagrams on the same machine, it works as expected. The process can be split into several threads without any issue.

  2. If I run different processes for sending and receiving, the receiving process receives nothing, even though the group join message and all datagrams show up in Wireshark.

  3. The scenario described in 2 works if, before listening, I also use the same socket to send a packet to the multicast group I have joined. The receiver then gets multicast datagrams for an unspecified amount of time, after which it simply stops receiving them (it keeps waiting for datagrams to arrive). Wireshark confirms that the datagrams are still being sent to/from the network.

  4. My latest findings indicate that if I periodically send a multicast message to the group which I subscribe to, I receive datagrams sent to this multicast group, also from other processes on the same host.

IP multicast does, to the best of my understanding, only define communication between hosts, while it is the responsibility of the OS to redirect incoming packets to the appropriate application. As packets always seem to show up in Wireshark, even when an application is not receiving them, it seems that Windows fails to handle incoming packets, or at least to deliver them to the appropriate applications.

I would appreciate it if anyone could either confirm or reject my reasoning and point me in the right direction on how to solve this problem. The goal is for several applications on the same host to be able to join a single multicast group and receive messages without also being required to send "junk" to the multicast group in order to receive data (the workaround described in point 4).

I am using Java for the implementation and can, if requested, post an MWE here. However, I fear that it may shift the focus from the scenario to programming, which is not the concern here (from what I can deduce).


Solution 1:

The OP is asking for a method to discover peers on a network with no knowledge of those peers. According to this source, mDNS requires at least knowledge of the host name of each peer.

As a result, mDNS is not a viable solution to this problem as described.

According to this answer about multicast, what you have asked for is possible. It involves using the SO_REUSEADDR socket option. This option allows multiple sockets to listen on the same address:port combination, so long as:

  1. The address is a multicast address; and
  2. SO_REUSEADDR is set on each and every socket that attempts to bind to the address:port combination.
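
The port-sharing part of this can be tried on its own, without any multicast involved. Below is a minimal sketch, assuming UDP over IPv4; the helper name bind_shared is made up for illustration, and the sockets bind to the wildcard address rather than to a multicast address.

```python
import socket

def bind_shared(port):
    """Return a UDP socket bound to `port` on all interfaces with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # The option has to be set before bind() for it to affect the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    return sock

first = bind_shared(0)            # port 0 lets the OS pick a free port
port = first.getsockname()[1]
second = bind_shared(port)        # second socket bound to the same port
```

If either socket skips the setsockopt call, the second bind is expected to fail with an "address already in use" error.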

However, according to this answer, you may also need to set SO_BROADCAST on each socket, which is certainly not intuitive! In my own tests using Python on Windows 10 I did NOT need to do this, so I am not sure how legitimate that answer is and treat it as potentially outdated. I mention it in case you get stuck.

Here is an example of code that I have tested and that produces no errors. It is part of an automated test that uses beacons to connect a worker process to a broker process. This test passes when run on a single node; I haven't tested it yet on multiple nodes.

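import socket
import struct

# Left blank on purpose; see the note below the code.
MCAST_GRP = ''
MCAST_PORT = 1234

MULTICAST_TTL = 2  # not used by the listener itself

def create_beacon_listener():
    # Plain UDP socket over IPv4.
    sock = socket.socket(
        socket.AF_INET,
        socket.SOCK_DGRAM,
        socket.IPPROTO_UDP
    )
    # Allow other sockets to bind to the same address:port combination.
    sock.setsockopt(
        socket.SOL_SOCKET,
        socket.SO_REUSEADDR,
        1
    )
    sock.setblocking(False)

    sock.bind((MCAST_GRP, MCAST_PORT))

    # Membership request: group address followed by the local interface
    # (INADDR_ANY leaves the choice of interface to the OS).
    multicast_grp_req = struct.pack(
        "4sl",
        socket.inet_aton(MCAST_GRP),
        socket.INADDR_ANY
    )

    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        multicast_grp_req
    )

    return sock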

The above code returns a listening socket bound to port MCAST_PORT. And yes, MCAST_GRP is set to a blank string, which means the socket binds to all local addresses rather than to one group address. This seems to be a limitation in either Windows or my knowledge, probably the latter; as far as I can tell, Windows does not allow bind() on a multicast address, so there the group is normally named only in the membership request.

Either way, using a specific address in the MCAST_GRP variable as written results in an error on Windows, so using a blank string at least gets the code to solve the problem the OP described.
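
If the error comes from bind() rather than from the group join, one option is to keep the bind address blank and pass the specific group only in the membership request. Below is a minimal sketch under that assumption. The group address 239.255.0.1 and the helper names are hypothetical, and this variant has not been verified on Windows here.

```python
import socket
import struct

MCAST_GRP = "239.255.0.1"   # hypothetical group address, for illustration only
MCAST_PORT = 1234

def build_membership_request(group):
    # 4-byte group address followed by a 4-byte interface address;
    # INADDR_ANY (0) leaves the choice of interface to the OS.
    return struct.pack("=4sl", socket.inet_aton(group), socket.INADDR_ANY)

def create_group_listener(group=MCAST_GRP, port=MCAST_PORT):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to the wildcard address, not to the group address.
    sock.bind(("", port))
    # Name the group only when joining it.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group),
    )
    return sock
```

The join itself needs a multicast-capable network interface, so create_group_listener() can still raise OSError on a host without one.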

This code can be reused multiple times within the same process, though I have not yet tested whether multiple processes can use it successfully. I have tested it with Python's asyncio framework, so with multiple sockets listening in the same thread.
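
To exercise several listeners, something has to send beacons to the group. Below is a minimal sketch of a matching sender, assuming a hypothetical group address of 239.255.0.1 and the same port and TTL values as above; the function names are made up for illustration.

```python
import socket

MCAST_GRP = "239.255.0.1"   # hypothetical group address, for illustration only
MCAST_PORT = 1234
MULTICAST_TTL = 2

def create_beacon_sender(ttl=MULTICAST_TTL):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Limit how many router hops the multicast datagrams may cross.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock

def send_beacon(sock, payload=b"beacon", group=MCAST_GRP, port=MCAST_PORT):
    # Returns the number of bytes handed to the network stack.
    return sock.sendto(payload, (group, port))
```

Running the sender in one process and a listener in each of several others would be one way to check the multi-process case that is still untested.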

As an aside, for cross-platform support, you can also get this code to work on *nix operating systems, and on those systems you can set MCAST_GRP to a specific address.
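
One way to keep a single code path for both cases is to pick the bind address from the platform. This is only a sketch; bind_address_for is a hypothetical helper, and the check relies on sys.platform being "win32" on Windows.

```python
import sys

def bind_address_for(group, platform=None):
    """Return the address to pass to bind() for the given multicast group."""
    platform = sys.platform if platform is None else platform
    # Blank (wildcard) address on Windows, the group address elsewhere.
    return "" if platform.startswith("win") else group
```

The group address itself is still needed for the membership request on every platform.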

EDIT:

I recently came across this resource from MS, which gives an example that appears to allow using a specific multicast group address rather than binding to all. It involves setting some more options in the multicast_grp_req variable. The example is C++ code, not Python, but the two are similar enough to get the gist.
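
I cannot say for certain which extra options that example sets, but one common addition to the membership request is the address of the local interface to join on, instead of INADDR_ANY. Below is a sketch of that idea in Python; the function name and the addresses used with it are assumptions for illustration, not taken from the MS example.

```python
import socket
import struct

def membership_request(group, interface_ip="0.0.0.0"):
    # Group address followed by the local interface address, 4 bytes each.
    # "0.0.0.0" is the same as INADDR_ANY and lets the OS pick the interface.
    return struct.pack(
        "4s4s",
        socket.inet_aton(group),
        socket.inet_aton(interface_ip),
    )
```

The result would be passed to setsockopt with socket.IPPROTO_IP and socket.IP_ADD_MEMBERSHIP in place of multicast_grp_req.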

Hope this helps :)