How does the internet really work? [closed]
There is much information I have seen on TCP/IP protocols/layers, DNS, LANs, VPNs, NAT schemes, SSL/TLS, and the like, which are, I'd say, the more "user-facing" aspects of the way the internet works. But try as I might, it is hard to learn how the internet really works (its "internal" parts, etc.).
Some example questions, to show what I mean by this...
- When I send a message to a computer over the internet, where (what kinds of places/organizations, and physically where) does the message go through until it reaches its destination?
- Why do I need to get internet from an ISP? Why can I not just connect straight to the internet?
- What composes the "backbone", the main core of the internet, and how does this work? Is this a secret, maybe?
So...
1) How does the internet really work; what makes it "spin"?
2) Is it possible to find more good information on these things on the web, and if yes, what are some good resources for this?
Solution 1:
The internet is a network of networks.
Let's say you have a network of 10 systems, each with an IP address, and Tom has a network, and Alice has a network. You'd need a separate connection to Tom and to Alice to talk to either, with the associated cost.
Now, let's say Tom is connected to Alice, and you're connected to Tom, and Tom lets you connect to Alice through him. Arrangements like this are what peering and transit agreements are about: peering is two networks exchanging traffic directly, and transit is one network carrying your traffic onward to others, as Tom does here.
Imagine needing to connect to a hundred different people separately. You can't have intercontinental point-to-point links to everyone, so instead there is a series of very high capacity connections, which are very expensive to maintain. You could, in theory, hook yourself into the main backbone or run your own, but it's cheaper to buy access from a reseller, i.e. an ISP. The ISP also has peering agreements (so you don't have to make agreements separately with Tom, Alice, Ali, Ivan, Ravi, Vanda...).
The internet works because it ties together these varied, otherwise disconnected networks in a coherent way. In practice, each 'network' is an AS (autonomous system), which is itself a collection of networks run by one operator.
Now that we have an overview, you can trace the route you would take to a server with tracert on Windows and traceroute on Linux. Each route has hops within your ISP, over to a larger ISP, and on to your final destination:
geek@tamandua:~/pystatgrab-0.5/glances-1.1.3$ traceroute www.superuser.com
traceroute to www.superuser.com (64.34.119.12), 30 hops max, 60 byte packets
1 menu (192.168.1.254) 7.264 ms 7.224 ms 7.192 ms
2 bb219-74-xxx-x.singnet.com.sg (219.74.xxx.x) 17.088 ms 18.808 ms 20.773 ms
3 202.166.xxx.xx (202.166.xxx.xxx) 22.701 ms 24.651 ms 26.585 ms
4 xe-0-0-0-3000.qt-ar04.singnet.com.sg (202.166.121.129) 28.496 ms 30.633 ms 32.386 ms
5 xe-8-3-0-0.qt-cr02.singnet.com.sg (202.166.126.209) 34.427 ms 36.272 ms 38.153 ms
6 ae6-0.singha.singnet.com.sg (202.166.120.186) 40.136 ms 13.885 ms 13.848 ms
7 ae5-0.beck.singnet.com.sg (202.166.126.41) 15.732 ms 12.018 ms 13.772 ms
8 203.208.190.57 (203.208.190.57) 17.938 ms 17.923 ms 19.544 ms
9 ge-1-0-0-0.sngc3-dr1.ix.singtel.com (203.208.173.134) 21.731 ms 203.208.171.213 (203.208.171.213) 23.515 ms 203.208.171.217 (203.208.171.217) 27.320 ms
10 ge-1-1-3-0.sngtp-dr2.ix.singtel.com (203.208.152.21) 29.300 ms 29.313 ms 203.208.171.197 (203.208.171.197) 31.083 ms
11 so-3-0-0-0.laxow-cr1.ix.singtel.com (203.208.151.222) 212.783 ms so-2-0-0-0.laxow-cr1.ix.singtel.com (203.208.151.86) 226.137 ms 202.607 ms
12 203.208.153.142 (203.208.153.142) 204.518 ms 208.651 ms ge-7-0-0-0.laxow-dr2.ix.singtel.com (203.208.183.158) 209.639 ms
13 peer1.com.any2ix.coresite.com (206.223.143.79) 197.931 ms 199.860 ms 213.576 ms
14 10ge.ten1-1.la-600w-cor-2.peer1.net (216.187.88.146) 203.925 ms 219.400 ms 221.328 ms
15 10ge-ten1-2.dal-eqx-cor-1.peer1.net (216.187.124.122) 266.703 ms 266.687 ms 268.531 ms
16 10ge-ten1-1.dal-eqx-cor-2.peer1.net (216.187.124.134) 282.273 ms 247.504 ms 249.410 ms
17 10ge-ten2-1.atl-telx-cor-1.peer1.net (216.187.124.118) 251.279 ms 253.250 ms 255.212 ms
18 10ge-ten1-1.atl-101mar-cor-1.peer1.net (216.187.120.226) 246.224 ms 262.020 ms 252.336 ms
19 10ge.xe-1-0-0.wdc-eqx-dis-1.peer1.net (216.187.115.37) 281.690 ms 269.931 ms 285.666 ms
20 10ge.ten1-2.wdc-sp2-cor-1.peer1.net (216.187.115.234) 287.404 ms 289.290 ms 291.204 ms
21 216.187.120.254 (216.187.120.254) 293.154 ms 295.091 ms 263.393 ms
22 10ge.xe-2-0-0.nyc-telx-dis-1.peer1.net (216.187.115.221) 265.291 ms 267.265 ms 282.774 ms
23 10ge.xe-0-0-0.nyc-telx-dis-2.peer1.net (216.187.115.182) 278.996 ms 267.974 ms 271.307 ms
24 oc48-po3-0.nyc-75bre-dis-1.peer1.net (216.187.115.134) 273.482 ms 275.482 ms 277.317 ms
25 gwny01.stackoverflow.com (64.34.41.58) 292.767 ms 294.730 ms 296.702 ms
In this case, I'm four hops from SingTel's local exchange (XE), a couple more hops from SingTel's routers named after beer (singha and beck, hops 6 and 7), and 11 hops from their LA exchange (laxow). From there the traffic is handed to Peer 1 in LA and carried through to Peer 1 in New York. Finally, hop 25 is Stack Overflow's own gateway (gwny01.stackoverflow.com), which passes our traffic to Stack Overflow's servers. Our traffic with Stack Overflow, in this instance, travels through 25 routers (hops) spread over a handful of connected networks before it arrives at Stack Overflow's servers.
SingTel is an AS for our purposes, as is Peer 1.
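To make the difference between hops and networks concrete, here is a minimal Python sketch that groups a few hops from the trace above by operator. The hostnames are copied from the output; the `OPERATORS` table and the `operator_of` and `group_hops` helpers are hypothetical and written by hand for this example. Real tools map IP addresses to AS numbers from routing data rather than guessing from hostnames.

```python
# Hostnames copied from a few hops of the traceroute output above.
HOPS = [
    (4, "xe-0-0-0-3000.qt-ar04.singnet.com.sg"),
    (6, "ae6-0.singha.singnet.com.sg"),
    (10, "ge-1-1-3-0.sngtp-dr2.ix.singtel.com"),
    (11, "so-3-0-0-0.laxow-cr1.ix.singtel.com"),
    (14, "10ge.ten1-1.la-600w-cor-2.peer1.net"),
    (24, "oc48-po3-0.nyc-75bre-dis-1.peer1.net"),
    (25, "gwny01.stackoverflow.com"),
]

# Hypothetical suffix-to-operator table, written by hand for this example.
OPERATORS = {
    "singnet.com.sg": "SingNet",
    "singtel.com": "SingTel",
    "peer1.net": "Peer 1",
    "stackoverflow.com": "Stack Overflow",
}


def operator_of(hostname):
    """Guess the operator from the hostname suffix (an assumption, not a lookup)."""
    for suffix, name in OPERATORS.items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return name
    return "unknown"


def group_hops(hops):
    """Collapse consecutive hops run by the same operator into one group."""
    groups = []
    for number, hostname in hops:
        name = operator_of(hostname)
        if groups and groups[-1][0] == name:
            groups[-1][1].append(number)
        else:
            groups.append((name, [number]))
    return groups


if __name__ == "__main__":
    for name, numbers in group_hops(HOPS):
        print(f"{name}: hops {numbers}")
```

Many hops collapse into a few operators, which is the sense in which a long trace still crosses only a handful of networks.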
These routes are decided by BGP between networks (which is why I go from SingTel to Peer 1 in LA) and by an IGP (interior gateway protocol) within an AS.
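As a rough illustration of how a route between networks falls out of who is connected to whom, the sketch below finds the path that crosses the fewest ASes in a toy graph. The `LINKS` graph, the AS names and the `shortest_as_path` function are all hypothetical. Real BGP works by neighbours advertising routes to each other and applies business policy before path length, so this only mimics its "prefer the shorter AS path" rule.

```python
from collections import deque

# Hypothetical links between autonomous systems (who is connected to whom).
LINKS = {
    "HomeISP": ["SingTel"],
    "SingTel": ["HomeISP", "Peer1", "OtherCarrier"],
    "OtherCarrier": ["SingTel", "Peer1"],
    "Peer1": ["SingTel", "OtherCarrier", "StackOverflow"],
    "StackOverflow": ["Peer1"],
}


def shortest_as_path(links, source, destination):
    """Breadth-first search: return the path crossing the fewest ASes, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for neighbour in links.get(path[-1], []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of agreements reaches the destination


if __name__ == "__main__":
    print(shortest_as_path(LINKS, "HomeISP", "StackOverflow"))
```

Removing the SingTel to Peer1 link from `LINKS` makes the search fall back to the longer path through `OtherCarrier`, which is roughly what happens when a link between two carriers goes down.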
Hypothetically you COULD run your own AS, make your own peering agreements and so on, but it would be very costly.
Solution 2:
Here's a very high-level view.
The Internet is basically a worldwide group of networked computers. To carry the huge amount of traffic that crosses these networks, governments and private companies lay massive cables between countries; these major cables are the 'backbone' of the Internet. Occasionally a ship will drag an anchor over one of these and damage or even break it, and when this happens it can cause a major outage for a particular country.
To get connected to this backbone you need to pay fees to the owner of the cable and you need the hardware. These are major costs, on the order of hundreds of thousands, if not millions, of dollars. If you personally had the money, you could connect without an ISP, but most people find it more cost-efficient to pay a small monthly fee.
Whenever you send information across the Internet, the information has a destination, a URL for example. Network equipment cannot direct a message (split into 'packets') to a text address, so the name in the URL is first translated into a numbered address, an IP address, e.g. 203.35.57.110. This translation is done by DNS (Domain Name System), which runs on DNS servers rather than on the routers themselves. There are various tiers of DNS servers; if a DNS server does not already know the IP for a name, it asks further up the hierarchy.
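The tiered lookup can be sketched as a walk down a toy hierarchy. Everything here is a stand-in: the `ROOT`, `TLD_SERVERS` and `AUTHORITATIVE` dictionaries and the `resolve` function are hypothetical, and the only real datum is the address 64.34.119.12, which is the one this thread gives for superuser.com. A real resolver sends network queries to each tier and caches the answers for a limited time.

```python
# Toy stand-ins for the tiers of DNS servers (all names are hypothetical).
ROOT = {"com": "com-servers"}  # the root knows who handles each top-level domain
TLD_SERVERS = {"com-servers": {"superuser.com": "superuser-ns"}}
AUTHORITATIVE = {
    "superuser-ns": {
        "superuser.com": "64.34.119.12",
        "www.superuser.com": "64.34.119.12",
    }
}


def resolve(hostname, cache):
    """Return the IP for hostname by walking root -> TLD -> authoritative."""
    name = hostname.lower().rstrip(".")
    if name in cache:  # answered before, so no lookup is needed
        return cache[name]
    labels = name.split(".")
    if len(labels) < 2:
        return None
    tld_server = ROOT.get(labels[-1])  # 1. ask the root who handles the TLD
    if tld_server is None:
        return None
    domain = ".".join(labels[-2:])
    name_server = TLD_SERVERS[tld_server].get(domain)  # 2. ask the TLD tier
    if name_server is None:
        return None
    address = AUTHORITATIVE[name_server].get(name)  # 3. ask the domain's server
    if address is not None:
        cache[name] = address  # remember the answer for next time
    return address


if __name__ == "__main__":
    cache = {}
    print(resolve("superuser.com", cache))
    print(resolve("example.org", cache))
```

The cache is the reason most lookups never reach the upper tiers: once a name has been resolved, the answer is reused.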
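These addresses are handed out in ranges by regional registries, so a range often corresponds roughly to a part of the world, e.g. much of 203.xxx is allocated in the Asia-Pacific region, which includes Australia. No router holds a separate entry for every single IP address; each one keeps a table of address ranges, enough to pick the next step for any packet that comes its way.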
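Picking a next step from a table of ranges is usually done by "longest prefix match": of all the ranges that contain the destination, the most specific one wins. Below is a minimal sketch using Python's standard `ipaddress` module. The `ROUTES` table, the next-hop names and the `next_hop` function are made up for illustration; real routers hold far larger tables and do this lookup in specialised hardware.

```python
import ipaddress

# Hypothetical routing table: (address range, where to send matching packets).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream provider"),  # default route
    (ipaddress.ip_network("64.0.0.0/8"), "carrier A"),
    (ipaddress.ip_network("64.34.0.0/16"), "carrier B"),
    (ipaddress.ip_network("192.168.1.0/24"), "local network"),
]


def next_hop(destination, routes=ROUTES):
    """Return the next hop of the most specific range containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in routes if address in network]
    if not matches:
        return None
    # The longest prefix (largest prefixlen) is the most specific match.
    _, best_hop = max(matches, key=lambda match: match[0].prefixlen)
    return best_hop


if __name__ == "__main__":
    for destination in ("64.34.119.12", "64.1.2.3", "8.8.8.8", "192.168.1.20"):
        print(destination, "->", next_hop(destination))
```

Four entries are enough to route any IPv4 address here, because the default route catches everything the more specific ranges do not.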
Example: you want to open the superuser home page.
- You type superuser.com into your browser and hit enter.
- Your computer looks in its own DNS cache for superuser.com and converts it to an IP; if it doesn't find it there, it asks your ISP's DNS server, and so on up the chain until it gets an IP (64.34.119.12).
- Your computer asks your ISP's router to send request packets to 64.34.119.12.
- Your ISP's router looks at its routing tables to see where it should send your request. It sees that 64.x.x.x is outside its local network, so it cannot deliver it directly and sends the request to a higher-level router.
- This goes on and on until the packet hits a router that knows that this IP is somewhere in the US, so it then sends the packet to the closest router to the US that it knows about.
- The US router sees that the next number in the IP is 34, and it knows that this is on the East Coast somewhere, so it sends it in that direction.
- Somewhere down the line a router sees that the IP is allocated to a particular ISP, so it sends it to that ISP.
- The router at the ISP sees that the IP is one of its own and knows exactly which machine to send it to.
- The receiving machine gets the request and sees that you have requested the home page, so it gets the data together and sends it back to you.
- The process starts all over again in the other direction, to carry the reply back to you.
This all happens in a matter of milliseconds.
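The router steps in the list above can be sketched as a chain of small tables, where each router only decides the next router and none of them knows the whole path. The router names, the `TABLES` contents and the `lookup` and `forward` functions are hypothetical; `"deliver"` is a marker invented here to mean "the destination is attached to this router". The hop limit mirrors the "30 hops max" shown by traceroute.

```python
import ipaddress

# Hypothetical routers, each with its own short table of (range, next router).
TABLES = {
    "home-router": [("0.0.0.0/0", "isp-router")],
    "isp-router": [("219.74.0.0/16", "deliver"), ("0.0.0.0/0", "backbone-router")],
    "backbone-router": [("64.34.0.0/16", "hosting-router"), ("0.0.0.0/0", "isp-router")],
    "hosting-router": [("64.34.119.0/24", "deliver"), ("0.0.0.0/0", "backbone-router")],
}


def lookup(table, destination):
    """Pick the entry with the most specific range that contains destination."""
    address = ipaddress.ip_address(destination)
    matching = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in table
        if address in ipaddress.ip_network(prefix)
    ]
    if not matching:
        return None
    return max(matching, key=lambda entry: entry[0].prefixlen)[1]


def forward(destination, start="home-router", max_hops=30):
    """Pass a packet from router to router; return (path, delivered)."""
    path = [start]
    current = start
    for _ in range(max_hops):
        target = lookup(TABLES[current], destination)
        if target is None:
            return path, False  # no route at all
        if target == "deliver":
            return path, True  # destination is attached to this router
        path.append(target)
        current = target
    return path, False  # hop limit reached, the packet is dropped


if __name__ == "__main__":
    print(forward("64.34.119.12"))
```

An address that nobody owns in this toy setup, such as 10.0.0.1, bounces between the ISP and backbone routers until the hop limit drops it, which is why real packets carry a hop counter.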
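This is just a simplified general overview. The IPs can be allocated in different ways: a very large company or military organisation might own all of 125.xxx.xxx.xxx, yet a small country may only be allocated something the size of 175.24.xxx.xxx (both ranges are illustrative rather than real allocations; each of the four numbers in an address can only go up to 255).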
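To put numbers on the difference between those two sizes of allocation, here is a small sketch using Python's standard `ipaddress` module. The two blocks are illustrative only, and the helper names `block_size` and `is_valid_ipv4` are made up for this example.

```python
import ipaddress


def block_size(cidr):
    """Number of addresses in a block written in CIDR notation."""
    return ipaddress.ip_network(cidr).num_addresses


def is_valid_ipv4(text):
    """True if text is a well-formed IPv4 address (each number 0 to 255)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    print(block_size("125.0.0.0/8"))    # all of 125.xxx.xxx.xxx -> 16777216
    print(block_size("125.24.0.0/16"))  # all of 125.24.xxx.xxx  -> 65536
    print(is_valid_ipv4("64.34.119.12"))
    print(is_valid_ipv4("300.1.2.3"))   # a number above 255 is rejected
```

Fixing the first number leaves three free numbers (256 × 256 × 256 addresses), while fixing the first two leaves only two (256 × 256), so the larger block is 256 times the size of the smaller one.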