Bittorrent ports, why do I need them?
I'm considering a file distribution between branch offices that uses Bittorrent. I understand that a Bittorrent client needs ports in the range of 6881-6999 to be forwarded to the internet to make the transfer faster.
What I don't understand is: how does this make things faster? I could understand if failing to provide proper means of communication between clients would prevent them from speaking to each other. But everywhere I look I just see the advice "Just forward the ports and the transfer will speed up".
Sorry if this seems off topic, but it strikes me as network related.
A common example of a P2P protocol is Bittorrent. In this protocol the communications are often managed by a tracker. This means that for data transfer, a minimum of three nodes is needed:
+-----+               +---------+      +------+
|     |  1.1.1.1:500  |         |      |      |
| You |<------------->| Tracker |<---->| Peer |
|     |               |         |      |      |
+-----+               +---------+      +------+
The connection for you starts with telling the tracker your IP address and port you are connectable on. The tracker then stores this in a state table:
+--------------+------------+
| Nodes        | Completion |
| 1.1.1.1:500  | 0%         |
| 2.2.2.2:1000 | 100%       |
+--------------+------------+
Peer has established that he is connectable on port 1000. We'll come back to this.
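As a rough illustration of the tracker's bookkeeping, here is a minimal sketch in Python. The `swarm` dictionary and the `announce` function are hypothetical names for this example only; real trackers speak an HTTP or UDP announce protocol and track more than a completion percentage.

```python
# Hypothetical in-memory tracker state: (ip, port) -> completion percentage.
swarm = {}

def announce(ip, port, completion):
    """Record the announcing node and hand back a copy of the state table."""
    swarm[(ip, port)] = completion
    return dict(swarm)

announce("2.2.2.2", 1000, 100)        # Peer announced earlier
table = announce("1.1.1.1", 500, 0)   # you join the swarm

for (ip, port), done in table.items():
    print(f"{ip}:{port} {done}%")
```

The table you get back is your only source of addresses to try, which is why the port each node reports has to be reachable.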
Addresses 1.1.1.1 and 2.2.2.2 represent the external addresses of NAT devices. These devices are ubiquitous in today's Internet since almost every user has a router installed to provide access to several computers, mobiles, games consoles, etc.
This means that behind these addresses are more addresses, one for each of these devices.
However, a single external address has only one set of ports (1-65535), so how does your router know that a connection arriving on port 500 should go to the computer running your torrent client? You instruct it, by providing it with a rule called a port forward that says "any connections you receive on port 500, I want to be forwarded to me", where "me" is your internal address (in one of the private ranges 10.x.x.x, 192.168.x.x, or 172.16.x.x through 172.31.x.x).
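To make the forwarding rule concrete, here is a small sketch of what the router does with an unsolicited inbound connection. The `port_forwards` table, the internal address 192.168.1.10, and both function names are made up for illustration; a real router is configured through its own interface, not through Python.

```python
import ipaddress

# Hypothetical forwarding table on the router:
# external port -> (internal address, internal port)
port_forwards = {500: ("192.168.1.10", 500)}

def route_inbound(external_port):
    """Return the internal destination for a new inbound connection,
    or None if no rule matches and the router drops it."""
    return port_forwards.get(external_port)

def is_private(address):
    """True for addresses in the private ranges used behind NAT."""
    return ipaddress.ip_address(address).is_private

print(route_inbound(500))    # a rule exists, so the connection is forwarded
print(route_inbound(6881))   # no rule, so the connection is dropped
print(is_private("192.168.1.10"), is_private("1.1.1.1"))
```

Without an entry in that table the router has no way to guess which internal machine an inbound connection was meant for.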
As you have just joined the 'swarm' by announcing to the tracker, the tracker sends you the above state table. You know you have just joined and have 0% completion, but Peer has 100% completion, meaning that if you connect to him you'll be able to start getting the data.
If Peer has not 'forwarded' his port (1000, as he reported to the tracker when he announced), however, you will not be able to connect and start receiving data. This is obviously not desirable, as you now cannot complete the torrent because no one is reachable to share it.
If Peer has not announced since you connected, he doesn't know you exist yet. However, if you have set up port forwarding correctly, when he does announce and get the new state table with you in it, he could initiate the connection with you. This will work since your port is forwarded.
If both of you did not have port forwarding enabled, then despite the fact you were both announcing to the tracker, because the ports you told the tracker about don't actually reach back to your machine, all possible data connections are blocked by your routers/NAT devices.
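The three cases above reduce to a simple rule, sketched here in Python. The `Member` class and `can_exchange` function are hypothetical, and the sketch deliberately ignores NAT traversal tricks such as UPnP or hole punching.

```python
from dataclasses import dataclass

@dataclass
class Member:
    name: str
    port_forwarded: bool

def can_exchange(a, b):
    # Only one side has to accept inbound connections; once the
    # connection is open, data can flow in both directions.
    return a.port_forwarded or b.port_forwarded

you = Member("You", port_forwarded=True)
peer = Member("Peer", port_forwarded=False)
third = Member("Third", port_forwarded=False)

print(can_exchange(you, peer))    # True: Peer can connect in to you
print(can_exchange(peer, third))  # False: neither side is reachable
```

A member without a forwarded port can therefore only trade with the subset of the swarm that is reachable, which is where the slowdown comes from.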
So in brief: port forwarding helps with the health of P2P data exchange by making it easier for connections to be established - and unless at least one member of each pair of peers has port forwarding of some kind, those two cannot exchange data in a P2P manner.
There is a ton of poor data in this question. Bittorrent works with a "tit for tat" scheme, wherein clients that are uploading get preference in downloading. To upload data, other clients need to be able to connect in to you, which can't happen if you're NATed or firewalled off. Thus, you open ports to allow other clients to connect in, you upload some data, and you get higher priority downloads.
There are some NAT circumvention techniques that help if the other client isn't firewalled/NATed, but at least one side has to have the open ports.
If it's all your private network, you could fudge the client to not do that preferential sending, but that's probably a lot more work than just opening the ports.
Here's a trivial reference for this behavior.
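The "tit for tat" preference described above can be sketched roughly as follows. This is a simplification, not how any particular client implements it: the function name, the `slots` parameter, and the sample rates are all assumptions, and real clients add refinements such as periodically trying a random peer.

```python
def choose_unchoked(upload_rates, slots=4):
    """Pick the peers that have recently sent us the most data.
    Those are the ones we agree to upload to in return."""
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    return ranked[:slots]

# Hypothetical recent rates (KiB/s) at which each peer has been sending to us.
rates = {"a": 50, "b": 0, "c": 120, "d": 10, "e": 75}
print(choose_unchoked(rates, slots=3))  # ['c', 'e', 'a']
```

Seen from the other side, a peer you cannot upload to (because nobody can connect in to you) has little reason to rank you highly.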
Also, you don't need to use those ports. Any port range will work as long as your client knows what's open to it so that it can inform the tracker.
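As a small illustration that the port number itself is arbitrary, this sketch opens a listening socket on whatever port the operating system hands out. The function name `open_listen_port` is made up for this example; a real client would then announce that port to the tracker, and the router would need a matching forward.

```python
import socket

def open_listen_port(port=0):
    """Bind a TCP listening socket; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("", port))
    sock.listen()
    return sock, sock.getsockname()[1]

sock, port = open_listen_port()
print(f"listening on port {port}; forward and announce this one")
sock.close()
```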