How does the internet really work? [closed]

I have seen plenty of information on TCP/IP protocols and layers, DNS, LANs, VPNs, NAT schemes, SSL/TLS, and the like, which I'd say are the more "user-facing" aspects of how the internet works. But try as I might, I find it hard to learn how the internet really works (its "internal" parts, and so on).

Some example questions, to show what I mean by this...

  • When I send a message to a computer over the internet, where (what kinds of places/organizations, and physically where) does the message go through until it reaches its destination?
  • Why do I need to get internet from an ISP? Why can I not just connect straight to the internet?
  • What composes the "backbone", the main core of the internet, and how does this work? Is this a secret, maybe?

So...

1) How does the internet really work; what makes it "spin"?

2) Is it possible to find more good information on these things on the web, and if yes, what are some good resources for this?


Solution 1:

The internet is a network of networks.

Let's say you have a network of 10 systems, each with an IP address, and Tom has a network, and Alice has a network. To talk to either of them you'd need a separate connection to Tom and to Alice, each with its associated cost.

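Now, let's say Tom is connected to Alice, and you're connected to Tom, and Tom lets you connect to Alice through him. That is the basic idea behind peering (strictly speaking, when Tom carries your traffic onward to a third network it is called transit, while peering is two networks exchanging their own traffic directly).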

Imagine needing to connect to a hundred different people separately. You can't run intercontinental point-to-point links to everyone, so instead there is a series of very high capacity connections, which are very expensive to maintain. You could, in theory, hook yourself into the main backbone or run your own, but it's cheaper to buy access from a reseller, i.e. an ISP. The ISP also has peering agreements (so you don't have to make agreements separately with Tom, Alice, Ali, Ivan, Ravi, Vanda....).

The internet works because it ties together these varied, otherwise disconnected networks in a coherent way. In practice each 'network' is an AS (autonomous system), which is a collection of networks run under a single operator's routing policy.

Now that we have an overview, you can trace the route your traffic takes to a server with tracert on Windows and traceroute on Linux. A typical route has hops within your ISP, then over to a larger ISP, and on to your final destination:

geek@tamandua:~/pystatgrab-0.5/glances-1.1.3$ traceroute www.superuser.com
traceroute to www.superuser.com (64.34.119.12), 30 hops max, 60 byte packets
 1  menu (192.168.1.254)  7.264 ms  7.224 ms  7.192 ms
 2  bb219-74-xxx-x.singnet.com.sg (219.74.xxx.x)  17.088 ms  18.808 ms  20.773 ms
 3  202.166.xxx.xx (202.166.xxx.xxx)  22.701 ms  24.651 ms  26.585 ms
 4  xe-0-0-0-3000.qt-ar04.singnet.com.sg (202.166.121.129)  28.496 ms  30.633 ms  32.386 ms
 5  xe-8-3-0-0.qt-cr02.singnet.com.sg (202.166.126.209)  34.427 ms  36.272 ms  38.153 ms
 6  ae6-0.singha.singnet.com.sg (202.166.120.186)  40.136 ms  13.885 ms  13.848 ms
 7  ae5-0.beck.singnet.com.sg (202.166.126.41)  15.732 ms  12.018 ms  13.772 ms
 8  203.208.190.57 (203.208.190.57)  17.938 ms  17.923 ms  19.544 ms
 9  ge-1-0-0-0.sngc3-dr1.ix.singtel.com (203.208.173.134)  21.731 ms 203.208.171.213 (203.208.171.213)  23.515 ms 203.208.171.217 (203.208.171.217)  27.320 ms
10  ge-1-1-3-0.sngtp-dr2.ix.singtel.com (203.208.152.21)  29.300 ms  29.313 ms 203.208.171.197 (203.208.171.197)  31.083 ms
11  so-3-0-0-0.laxow-cr1.ix.singtel.com (203.208.151.222)  212.783 ms so-2-0-0-0.laxow-cr1.ix.singtel.com (203.208.151.86)  226.137 ms  202.607 ms
12  203.208.153.142 (203.208.153.142)  204.518 ms  208.651 ms ge-7-0-0-0.laxow-dr2.ix.singtel.com (203.208.183.158)  209.639 ms
13  peer1.com.any2ix.coresite.com (206.223.143.79)  197.931 ms  199.860 ms  213.576 ms
14  10ge.ten1-1.la-600w-cor-2.peer1.net (216.187.88.146)  203.925 ms  219.400 ms  221.328 ms
15  10ge-ten1-2.dal-eqx-cor-1.peer1.net (216.187.124.122)  266.703 ms  266.687 ms  268.531 ms
16  10ge-ten1-1.dal-eqx-cor-2.peer1.net (216.187.124.134)  282.273 ms  247.504 ms  249.410 ms
17  10ge-ten2-1.atl-telx-cor-1.peer1.net (216.187.124.118)  251.279 ms  253.250 ms  255.212 ms
18  10ge-ten1-1.atl-101mar-cor-1.peer1.net (216.187.120.226)  246.224 ms  262.020 ms  252.336 ms
19  10ge.xe-1-0-0.wdc-eqx-dis-1.peer1.net (216.187.115.37)  281.690 ms  269.931 ms  285.666 ms
20  10ge.ten1-2.wdc-sp2-cor-1.peer1.net (216.187.115.234)  287.404 ms  289.290 ms  291.204 ms
21  216.187.120.254 (216.187.120.254)  293.154 ms  295.091 ms  263.393 ms
22  10ge.xe-2-0-0.nyc-telx-dis-1.peer1.net (216.187.115.221)  265.291 ms  267.265 ms  282.774 ms
23  10ge.xe-0-0-0.nyc-telx-dis-2.peer1.net (216.187.115.182)  278.996 ms  267.974 ms  271.307 ms
24  oc48-po3-0.nyc-75bre-dis-1.peer1.net (216.187.115.134)  273.482 ms  275.482 ms  277.317 ms
25  gwny01.stackoverflow.com (64.34.41.58)  292.767 ms  294.730 ms  296.702 ms

In this case, I'm four hops from Singtel's local exchange (XE), six to seven hops from Singtel's routers named after beer (singha, beck), and 11 hops from their LA exchange (laxow), where the traffic is handed to Peer 1 in LA and carried through to Peer 1 in New York. Finally, the gateway at hop 25 passes our traffic to Stack Overflow's servers. Our traffic with Stack Overflow, in this instance, passes through 25 routers, spread across just a handful of networks, before it arrives at Stack Overflow's servers.
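To see how few distinct networks those 25 hops really cover, here is a minimal Python sketch that groups a handful of hops from the trace by the domain in their reverse-DNS name. Treating the registered domain as "the network" is only a rough assumption (singnet.com.sg and singtel.com belong to the same company, for instance), and the helper names are made up for illustration.

```python
from itertools import groupby

# A few hops copied from the trace above: (hop number, reverse-DNS name).
HOPS = [
    (4, "xe-0-0-0-3000.qt-ar04.singnet.com.sg"),
    (7, "ae5-0.beck.singnet.com.sg"),
    (9, "ge-1-0-0-0.sngc3-dr1.ix.singtel.com"),
    (11, "so-3-0-0-0.laxow-cr1.ix.singtel.com"),
    (13, "peer1.com.any2ix.coresite.com"),
    (14, "10ge.ten1-1.la-600w-cor-2.peer1.net"),
    (24, "oc48-po3-0.nyc-75bre-dis-1.peer1.net"),
    (25, "gwny01.stackoverflow.com"),
]

# Suffixes where the operator's domain needs three labels instead of two.
# Assumption: ".com.sg" is the only such suffix in this trace.
THREE_LABEL_SUFFIXES = (".com.sg",)


def operator_domain(hostname):
    """Return a rough 'who runs this router' guess from its hostname."""
    labels = hostname.split(".")
    keep = 3 if hostname.endswith(THREE_LABEL_SUFFIXES) else 2
    return ".".join(labels[-keep:])


def networks_in_order(hops):
    """Collapse consecutive hops that share an operator domain."""
    domains = [operator_domain(name) for _, name in hops]
    return [domain for domain, _ in groupby(domains)]


if __name__ == "__main__":
    for domain in networks_in_order(HOPS):
        print(domain)
```

Running it lists five domains in order (singnet.com.sg, singtel.com, coresite.com, peer1.net, stackoverflow.com), which is a much smaller number than the hop count.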

Singtel is an AS for our purposes, as is peer 1.

These routes are decided by BGP between networks (so that I would connect from Singtel to Peer 1 in LA) and by an IGP (an interior gateway protocol, such as OSPF or IS-IS) within an AS.
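As a very rough illustration of routing between ASes, the sketch below picks the path that crosses the fewest networks in a toy graph. Real BGP is policy-driven (business relationships, local preference and so on), and AS-path length is only one of its tie-breakers, so treat this as a simplification. "TransitX" and "TransitY" are hypothetical networks, and the links are invented for the example.

```python
from collections import deque

# Toy map of which networks are directly connected to which.
# Singtel and Peer1 appear in the trace; TransitX/TransitY are hypothetical.
LINKS = {
    "Singtel": ["Peer1", "TransitX"],
    "TransitX": ["Singtel", "TransitY"],
    "TransitY": ["TransitX", "Peer1"],
    "Peer1": ["Singtel", "TransitY", "StackOverflow"],
    "StackOverflow": ["Peer1"],
}


def shortest_as_path(links, source, destination):
    """Breadth-first search: return the path crossing the fewest networks."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for neighbour in links.get(path[-1], []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no route exists


if __name__ == "__main__":
    print(" -> ".join(shortest_as_path(LINKS, "Singtel", "StackOverflow")))
```

In this toy graph the direct Singtel to Peer1 link wins over the longer detour through the two transit networks.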

Hypothetically you COULD run your own AS, make your own peering agreements and so on, but it would be very costly.

Solution 2:

Here's a very high-level view.

The Internet is basically a world-wide group of networked computers. To carry the huge amount of traffic transported across these networks, governments and private companies lay massive cables between countries. These major cables are the 'backbone' of the Internet. Occasionally a ship will drag an anchor over one of these and damage or possibly even break it. If this happens, it can cause a major outage for a particular country.

To get connected to this backbone you need to pay fees to the owner of the cable and you need the hardware. These are major costs, on the order of hundreds of thousands, if not millions, of dollars. If you personally had the money, you could connect without an ISP. Most people find it more cost-efficient to pay a small monthly fee, though.

Whenever you send information across the Internet, the information has a destination, for example the domain name in a URL. Network equipment cannot direct a message (split into 'packets') to a text address, so the name is first translated into a numbered address, an IP address, e.g. 203.35.57.110. This translation is done by DNS (Domain Name System). It is DNS servers, not routers, that store the lists of names matched with IP addresses. There are various tiers of DNS servers, and if a DNS server does not have the answer in its own records, it asks servers further up the hierarchy.
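The "ask further up the chain" idea can be sketched with a toy resolver in Python. This mirrors the simplified description here rather than real DNS, which walks delegations from the root servers down. The class and the resolver names ("local", "isp") are hypothetical; the superuser.com address is the one shown in the traceroute output.

```python
class ToyResolver:
    """A resolver with its own records and an optional parent to ask."""

    def __init__(self, name, records=None, parent=None):
        self.name = name
        self.records = dict(records or {})
        self.parent = parent

    def resolve(self, hostname):
        """Return (ip_address, name_of_resolver_that_answered)."""
        if hostname in self.records:
            return self.records[hostname], self.name
        if self.parent is None:
            raise KeyError(f"{hostname} not found")
        address, answered_by = self.parent.resolve(hostname)
        # Cache the answer so the next lookup is answered locally.
        self.records[hostname] = address
        return address, answered_by


if __name__ == "__main__":
    isp = ToyResolver("isp", {"superuser.com": "64.34.119.12"})
    local = ToyResolver("local", parent=isp)

    print(local.resolve("superuser.com"))  # answered by the parent
    print(local.resolve("superuser.com"))  # answered from the local cache
```

The second lookup never leaves the local resolver, which is why repeat visits to a site skip most of the DNS work.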

These addresses are usually handed out in ranges by region, e.g. many 203.xxx addresses are in Australia and the wider Asia-Pacific region. No network needs to know where every single IP address is. Each router only keeps a list of address ranges, enough to pass any packet that comes its way one step closer to its destination.
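A router's "small list" is a table of address ranges, and it forwards each packet according to the most specific range that matches (longest-prefix match), falling back to a default route. A minimal sketch using Python's standard ipaddress module follows; the table entries and next-hop labels are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (address range, where to send matching packets).
ROUTES = [
    ("0.0.0.0/0", "upstream provider"),        # default route: matches everything
    ("64.0.0.0/8", "link towards the US"),
    ("64.34.0.0/16", "link towards the hosting ISP"),
    ("192.168.1.0/24", "local network"),
]


def next_hop(destination, routes=ROUTES):
    """Pick the matching route with the longest (most specific) prefix."""
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network.prefixlen, hop))
    # The default route guarantees at least one match.
    return max(matches)[1]


if __name__ == "__main__":
    for ip in ("64.34.119.12", "64.200.1.1", "203.35.57.110", "192.168.1.20"):
        print(ip, "->", next_hop(ip))
```

Note that 203.35.57.110 matches nothing specific, so it simply goes to the upstream provider: the router does not need to know where that address lives.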

Example: you want to open the superuser home page.

  1. You type superuser.com into your browser and hit enter.
  2. Your computer checks its own DNS cache for superuser.com and converts it to an IP. If it doesn't find it there, it asks your ISP's DNS and so on up the chain until it gets an IP (64.34.119.12).
  3. Your computer asks your ISP's router to send request packets to 64.34.119.12
  4. Your ISP's router looks at its routing tables to see where it should send your request. It sees that 64.x.x.x is outside of its local network and therefore it cannot deliver it directly, so it sends the request to a higher-level router.
  5. This goes on and on until the packet hits a router that knows that this IP is somewhere in the US, so it then sends the packet to the closest router to the US that it knows about.
  6. The US router sees that the next number in the IP is 34, and it knows that this is on the East coast somewhere, so it sends it in that direction.
  7. Somewhere down the line a router sees that the IP is allocated to a particular ISP, so it sends it to that ISP.
  8. The router at the ISP sees that the IP is one of its own and knows exactly which machine to send it to.
  9. The receiving machine gets the request and sees that you have requested the home page, so it gets the data together and sends it back to you.
  10. The process starts all over again in the other direction, with the response packets being routed back to your IP address.

This all happens in a matter of milliseconds.

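This is just a simplified general overview. The IPs can be allocated in different ways: a very large company or military organisation might own all of 125.xxx.xxx.xxx, yet a small country may only be allocated a much smaller block (for illustration, something like 202.24.xxx.xxx). Each of the four numbers in an address can only range from 0 to 255.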
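To put numbers on the difference between those allocation sizes, here is a small sketch using Python's ipaddress module. The specific blocks are only illustrative, not real allocations to any particular organisation or country.

```python
import ipaddress


def block_size(cidr):
    """Number of addresses in a block written in CIDR notation."""
    return ipaddress.ip_network(cidr).num_addresses


def is_valid_ipv4(text):
    """True if text is a well-formed IPv4 address (each part 0 to 255)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # "All of 125.xxx.xxx.xxx" is a /8; a block like 202.24.xxx.xxx is a /16.
    print(block_size("125.0.0.0/8"))   # 2**24 addresses
    print(block_size("202.24.0.0/16"))  # 2**16 addresses
    print(is_valid_ipv4("64.34.119.12"))
    print(is_valid_ipv4("256.1.1.1"))   # 256 is out of range
```

A /8 holds 16,777,216 addresses while a /16 holds 65,536, so the larger block is 256 times the size of the smaller one.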