How is speed decided on a torrent network?

I am curious how download speed is decided on a torrent network, since there are no central hubs. If there are 300 seeders, why can't a client just connect to all 300 and not seed anything itself? Is it built into the client (which I highly doubt)? How does the whole sharing thing work?

PS: I am not sure if this is the right place to ask, but it surely didn't belong on Stack Overflow. Also, I don't want to know how torrent downloads can be made faster. I want to know how they work.


Solution 1:

BitTorrent is not completely "hubless": the data transfer itself is decentralized, but peer discovery is not. Initially, torrents depended on a central hub called the tracker - again, not to exchange parts of the file, but to discover who else is in the swarm. (I believe a torrent can list multiple trackers for redundancy.) With the introduction of DHT, a BitTorrent peer can also find other peers through the DHT, in addition to or instead of using a tracker. DHT itself depends on prior knowledge of a few well-known DHT "nodes" (I'm unsure of the exact terminology) to "bootstrap" a peer that has never queried the DHT, or hasn't for a while.

A client is free to connect to every peer it knows about simultaneously, and most do, subject to any "connection limit" setting the program supports.

From the official BitTorrent spec:

Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later in this document.

Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times - whenever a downloader doesn't have something they currently would ask a peer for in unchoked, they must express lack of interest, despite being choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.

Connections start out choked and not interested.

When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called 'pipelining'.) On the other side, requests which can't be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.
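The two bits of state on each end of a connection, as described in the quoted passage, can be sketched roughly as follows. This is only an illustration, not code from any real client: the class name `PeerConnection`, the field names, and the `incoming_requests` queue are assumptions made up for this example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PeerConnection:
    """One connection as seen from our side (hypothetical model)."""
    # Per the spec, connections start out choked and not interested.
    am_choking: bool = True        # we are choking the remote peer
    am_interested: bool = False    # the remote peer has something we want
    peer_choking: bool = True      # the remote peer is choking us
    peer_interested: bool = False  # we have something the remote peer wants
    # Requests from the remote peer that we have not written to TCP yet.
    incoming_requests: deque = field(default_factory=deque)

    def can_download(self) -> bool:
        # Data flows to us when we are interested and the peer is not choking us.
        return self.am_interested and not self.peer_choking

    def can_upload(self) -> bool:
        # Data flows to the peer when it is interested and we are not choking it.
        return self.peer_interested and not self.am_choking

    def choke_peer(self) -> None:
        # When we choke the peer, requests still queued in memory are dropped.
        self.am_choking = True
        self.incoming_requests.clear()


conn = PeerConnection()
conn.am_interested = True      # we found a piece we want
conn.peer_choking = False      # the peer sent us an unchoke
print(conn.can_download())     # True
```

Keeping the unsent requests in an in-memory queue is what lets the uploading side discard them all at once when it decides to choke.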

So, for you to get data from a peer, you must be "interested" in what that peer has and the peer must not be choking you - according to the protocol. Further on:

Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-ish algorithm to ensure that they get a consistent download rate.

The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one.

There are several criteria a good choking algorithm should meet. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as 'fibrillation'. It should reciprocate to peers who let it download. Finally, it should try out unused connections once in a while to find out if they might be better than the currently used ones, known as optimistic unchoking.

The currently deployed choking algorithm avoids fibrillation by only changing who's choked once every ten seconds. It does reciprocation and number of uploads capping by unchoking the four peers which it has the best download rates from and are interested. Peers which have a better upload rate but aren't interested get unchoked and if they become interested the worst uploader gets choked. If a downloader has a complete file, it uses its upload rate rather than its download rate to decide who to unchoke.

For optimistic unchoking, at any one time there is a single peer which is unchoked regardless of its upload rate (if interested, it counts as one of the four allowed downloaders.) Which peer is optimistically unchoked rotates every 30 seconds. To give them a decent chance of getting a complete piece to upload, new connections are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.
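As a rough sketch of the choking rules quoted above, the selection step might look like this. The `Peer` class, its fields, and both function names are hypothetical, and the weighted random pick is only an approximation of the "three times as likely" rule for new connections, not the spec's exact rotation.

```python
import random
from dataclasses import dataclass

RECHOKE_INTERVAL = 10      # seconds between changes to who is choked
OPTIMISTIC_INTERVAL = 30   # seconds between optimistic unchoke rotations


@dataclass
class Peer:
    name: str
    rate: float            # download rate from this peer (upload rate if we are a seed)
    interested: bool
    is_new: bool = False   # recently connected


def pick_unchoked(peers, optimistic=None, slots=4):
    """Return the names of peers to unchoke for this round."""
    unchoked = set()
    used = 0  # interested peers unchoked so far; these fill the slots
    if optimistic is not None:
        unchoked.add(optimistic.name)
        if optimistic.interested:
            used += 1  # counts as one of the four allowed downloaders
    for p in sorted(peers, key=lambda p: p.rate, reverse=True):
        if used >= slots:
            break
        if p.name in unchoked:
            continue
        # Faster peers that are not interested are unchoked too, but they
        # do not consume a slot until they become interested.
        unchoked.add(p.name)
        if p.interested:
            used += 1
    return unchoked


def pick_optimistic(candidates, rng=None):
    """Pick the optimistic unchoke, favouring new connections 3 to 1."""
    if not candidates:
        return None
    rng = rng or random
    weights = [3 if p.is_new else 1 for p in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A client would call `pick_unchoked` every `RECHOKE_INTERVAL` seconds and `pick_optimistic` every `OPTIMISTIC_INTERVAL` seconds, then send choke and unchoke messages for whatever changed.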

So most BitTorrent peers implement "choking" algorithms that keep things fair, while giving new connections preferential treatment so they have a chance to become a useful part of the swarm. A peer could run a different, more selfish algorithm, but without cooperation from the other peers the "bad" peer would simply stay choked by them and receive little or no data beyond the occasional optimistic unchoke.

More peers generally means more speed, and faster peers are preferred. The upload capacity of each peer, any download limits you've set, and your physical link's upload/download capacity also affect speed.
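As a back-of-the-envelope illustration of those limits (not something taken from the spec), your download rate cannot exceed the smallest of three numbers: what the peers are collectively sending you, your link capacity, and any limit set in the client. The function and parameter names below are made up for this sketch.

```python
def download_rate_bound(peer_send_rates, link_capacity, client_limit=None):
    """Upper bound on download rate, with all rates in the same unit."""
    # You cannot receive more than the peers send in total,
    # nor more than your own link can carry.
    bound = min(sum(peer_send_rates), link_capacity)
    if client_limit is not None:
        # A download limit configured in the client caps it further.
        bound = min(bound, client_limit)
    return bound


# Three peers sending 100, 200 and 300 KB/s over a 500 KB/s link.
print(download_rate_bound([100, 200, 300], link_capacity=500))  # 500
```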

I (and others) have described how BitTorrent works at a high level in more detail at this question.