Why is my WebRTC peer-to-peer application failing to work properly?

It's been quite a long time since I've posted here. I wanted to bounce this off you, as it has been making my brain hurt. I have been developing a real-time video chat app with WebRTC. I know that the obligatory "it's somewhere in the network stack (NAT)" answer always applies.

As always seems to be the case with WebRTC, it works perfectly on my laptop between tabs or between Safari and Chrome. However, over the internet, on an HTTPS site I've created, it is spotty at best. The laptop can receive and display the media stream from my iPhone, but the iPhone cannot receive the media stream from my laptop: it just shows a black square for the remote video.

Any pointers would be most appreciated, as I've been going crazy. I know that TURN servers are an inevitable aspect of WebRTC, but I'm trying to avoid employing one.

So, here is my Session class, which handles essentially all of the WebRTC-related client-side session logic:

(The publish method is just an inherited member that emulates EventTarget/EventEmitter functionality and the p2p config is just for Google's public STUN servers)

class Session extends Notifier {
    constructor(app) {
        super()

        this.app = app

        this.client = this.app.client

        this.clientSocket = this.client.socket

        this.p2p = new RTCPeerConnection(this.app.config.p2pConfig)

        this.closed = false

        this.initialize()
    }

    log(message) {
        if (this.closed) return

        console.log(`[${Date.now()}] {Session} ${message}`)
    }

    logEvent(event, message) {
        let output = event
        if (message) output += `: ${message}`
        this.log(output)
    }

    signal(family, data) {
        if (this.closed) return
        
        if (! data) return

        let msg = {}
        msg[family] = data
        
        this.clientSocket.emit("signal", msg)
    }

    initialize() {
        this.p2p.addEventListener("track", async event => {
            if (this.closed) return
            try {
                const [remoteStream] = event.streams
                this.app.mediaManager.remoteVideoElement.srcObject = remoteStream
            } catch (e) {
                this.logEvent("Failed adding track", `${e}`)
                this.close()
            }
        })

        this.p2p.addEventListener("icecandidate", event => {
            if (this.closed) return 
            if (! event.candidate) return
            this.signal("candidate", event.candidate)
            this.logEvent("Candidate", "Sent")
        })

        this.p2p.addEventListener("connectionstatechange", event => {
            if (this.closed) return
            switch (this.p2p.connectionState) {
                case "connected":
                    this.publish("opened")
                    this.logEvent("Opened")
                    break
            
                // A fail safe to ensure that faulty connections 
                // are terminated abruptly
                case "disconnected":
                case "closed":
                case "failed":
                    this.close()
                    break

                default:
                    break
            }
        })

        this.clientSocket.on("initiate", async () => {
            if (this.closed) return
            try {
                const offer = await this.p2p.createOffer()
                await this.p2p.setLocalDescription(offer)
                this.signal("offer", offer)
                this.logEvent("Offer", "Sent")
            } catch (e) {
                this.logEvent("Uninitiated", `${e}`)
                this.close()
            }

        })

        this.clientSocket.on("signal", async data => {
            if (this.closed) return
            try {
                if (data.offer) {
                    // Wait for the remote description to be applied before creating the answer
                    await this.p2p.setRemoteDescription(new RTCSessionDescription(data.offer))
                    this.logEvent("Offer", "Received")
                    const answer = await this.p2p.createAnswer()
                    await this.p2p.setLocalDescription(answer)
                    this.signal("answer", answer)
                    this.logEvent("Answer", "Sent")
                }

                if (data.answer) {
                    const remoteDescription = new RTCSessionDescription(data.answer)
                    await this.p2p.setRemoteDescription(remoteDescription)
                    this.logEvent("Answer", "Received")
                }

                if (data.candidate) {
                    try {
                        await this.p2p.addIceCandidate(data.candidate)
                        this.logEvent("Candidate", "Added")
                    } catch (e) {
                        this.logEvent("Candidate", `Failed => ${e}`)
                    }
                }
            } catch (e) {
                this.logEvent("Signal Failed", `${e}`)
                this.close()
            }
        })

        this.app.mediaManager.localStream.getTracks().forEach(track => {
            this.p2p.addTrack(track, this.app.mediaManager.localStream)
        })
    }

    close() {
        if (this.closed) return

        this.p2p.close()
        this.app.client.unmatch()
        this.logEvent("Closed")
        
        this.closed = true
    }
}

I've worked with WebRTC for a good while now and am deploying a production-level website for many-to-many broadcasts, so I can happily help with this, but don't hit me: I'm about to spoil some of your fun.

The session description (SDP) and ICE candidates you exchange contain the send/receive IP addresses of both connecting users. When neither of you is actually reachable at those addresses (neither side is port-forwarded or able to act as a host, and STUN alone cannot get through the NAT), a TURN server is in fact required to relay the media. NATs and firewalls behave this way for security reasons, and many users will need TURN if you decide to go this route.
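To make that concrete, here is a rough sketch of what adding a TURN entry to the peer connection config could look like, plus a small helper for checking which kinds of candidates are being gathered. The TURN URL and credentials are placeholders (I don't know your setup), and `buildP2pConfig` and `candidateType` are names I made up for illustration, not part of any library.

```javascript
// Sketch only: the TURN URL and credentials passed in below are placeholders.
function buildP2pConfig({ turnUrl, username, credential }) {
    return {
        iceServers: [
            // Google's public STUN server, as mentioned in the question
            { urls: "stun:stun.l.google.com:19302" },
            // TURN relay, used when no direct path between the peers works
            { urls: turnUrl, username, credential },
        ],
    }
}

// Extracts the candidate type ("host", "srflx", "prflx" or "relay")
// from an ICE candidate line; returns null if no type is present.
function candidateType(candidateLine) {
    const match = / typ (\w+)/.exec(candidateLine)
    return match ? match[1] : null
}

const config = buildP2pConfig({
    turnUrl: "turn:turn.example.com:3478", // hypothetical server
    username: "demo-user",                 // hypothetical credentials
    credential: "demo-pass",
})

// A made-up candidate line using a documentation-range IP address
const sample = "candidate:1 1 udp 41885439 198.51.100.7 50000 typ relay raddr 0.0.0.0 rport 0"
console.log(candidateType(sample)) // relay
```

In your Session class, the config would go to `new RTCPeerConnection(config)`, and you could log `candidateType(event.candidate.candidate)` inside the "icecandidate" listener. If no "relay" candidates ever show up, the TURN server is probably not being reached.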

You can skip a TURN server completely, but you would still need a server: you'd go the route of sending and receiving RTP through a media server that routes it, such as an MCU or SFU.

These solutions are designed to take in WebRTC-produced tracks and output them to many viewers (consumers).

Here's an SFU I use that works great for many-to-many if you can code it. It's Node.js-friendly if you don't know languages other than JavaScript: https://mediasoup.org/