Why do we need a subnet mask?
Solution 1:
We need a subnet mask for IPv4 addresses because the address doesn't give any information on the network size. Class sizes are not the network sizes. In practical networks all IPv4 networks are broken up into subnets that are smaller than the class size.
For example, you could break up the class C network 200.200.200.0/255.255.255.0 into two smaller networks (potentially at separate locations), 200.200.200.0/255.255.255.128 and 200.200.200.128/255.255.255.128, assuming neither needed more than 126 hosts. In reality, most companies only get enough IPv4 addresses for the servers that need to be on the public Internet. I've personally seen setups with 32-, 16-, and 8-address networks (that would be masks of 255.255.255.224, 255.255.255.240, and 255.255.255.248 respectively).
Having IP networks only in class-size blocks was too restrictive because it limited the number of networks that could be allocated, with the 127 class A networks taking half of the space. Not to mention that a single network of roughly 16.7 million nodes (2^24, the size of a class A) is completely unmanageable. Instead, in 1993 Classless Inter-Domain Routing (CIDR) was introduced to allow the networks to be split up.
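As a rough illustration of the split described above, here is a small sketch using Python's standard `ipaddress` module. The helper name `mask_for_block_size` is made up for this example, and it assumes the block size is a power of two.

```python
import ipaddress

# The class C network from the example above, written with its mask.
network = ipaddress.ip_network("200.200.200.0/255.255.255.0")

# Lengthening the mask by one bit splits the network into two halves.
halves = list(network.subnets(prefixlen_diff=1))
for subnet in halves:
    # num_addresses includes the network and broadcast addresses,
    # so the usable host count is two fewer.
    print(subnet.with_netmask, "usable hosts:", subnet.num_addresses - 2)


def mask_for_block_size(size: int) -> str:
    """Hypothetical helper: dotted mask for a block of `size` addresses.

    Assumes `size` is a power of two (8, 16, 32, ...).
    """
    host_bits = size.bit_length() - 1
    return str(ipaddress.ip_network(f"0.0.0.0/{32 - host_bits}").netmask)


for size in (32, 16, 8):
    print(size, "addresses ->", mask_for_block_size(size))
```

Each half reports 126 usable hosts, and the three small block sizes map to the masks listed above.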
Also to be clear the purpose of the subnet mask is to determine which hosts are on the local network and which are outside of the network. Hosts can talk directly to hosts on the same network, but they need to communicate with a router to talk to hosts on external networks.
Solution 2:
The 1st Octet already specify the network class (1-127: A, 128-191: B, 192-223: C etc.). A, B, or C implies the number of octets for network (respectively, 255.0.0.0, 255.255.0.0, 255.255.255.0), which automatically tells you how many hosts is allowed for each class of network.
Right, but if someone were to subnet that network, you'd need the subnet mask to know how big a subnet you were in. Yes, with classful addressing, the class tells you the size of the network and allows you to tell whether a host is in the same network as you, but if that network is subnetted, without the subnet mask, how would you know whether another node is in the same subnet as you?
Say you're on an Ethernet network that uses classful addressing with subnetting. Your IP address is 1.2.3.4 and you want to reach 1.3.1.1. Do you use ARP to reach that address? Well, it depends on whether 1.2.3.4 and 1.3.1.1 are in the same subnet. Even if they're in the same network, if they're in different subnets, a router needs to be used. If they're in the same subnet, then ARP should be used.
So you need the subnet mask if subnetting is in use, even with classful networks.
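To make the 1.2.3.4 / 1.3.1.1 case concrete, here is a minimal sketch with Python's `ipaddress` module. The subnet mask 255.255.0.0 is an assumption picked for illustration (the example above doesn't say how the class A network is subnetted), and `same_network` is a made-up helper name.

```python
import ipaddress


def same_network(local: str, remote: str, mask: str) -> bool:
    """Hypothetical helper: is `remote` inside the network that
    `local` belongs to under the given mask?"""
    local_network = ipaddress.ip_interface(f"{local}/{mask}").network
    return ipaddress.ip_address(remote) in local_network


me, peer = "1.2.3.4", "1.3.1.1"

# Class A default mask: both addresses are in network 1.0.0.0.
print(same_network(me, peer, "255.0.0.0"))    # True

# Assumed subnet mask: 1.2.0.0 and 1.3.0.0 are different subnets,
# so the traffic would go to a router instead of being ARPed for.
print(same_network(me, peer, "255.255.0.0"))  # False
```

The class alone gives the first answer; only the subnet mask gives the second, which is the one that decides between ARP and a router.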
I think you're confusing subnetting with CIDR, actually. Without CIDR, even with subnetting, you don't need the subnet mask between administrative regions. But you still need it inside the network!
Solution 3:
A subnet mask is used to do a bitwise operation on an IP address, in conjunction with a network address. If my memory serves me well, you take an IP address and do a bitwise AND of it with the subnet mask for a given network. If the result equals the network address, then the IP address is on that particular network. Routers that have routing tables of network addresses and subnet masks can use simple binary maths (which is very fast, if not the fastest thing for computers to handle) to find out which interface to punt a packet out of.
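The AND-and-compare check described above can be sketched roughly like this. The routing table, the interface names, and the function names are all invented for the example, and picking the most specific matching entry (the longest mask) is an assumption about how a router would break ties rather than something stated in this answer.

```python
import ipaddress


def to_int(dotted: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def on_network(address: str, network: str, mask: str) -> bool:
    """Bitwise AND the address with the mask and compare to the network."""
    return (to_int(address) & to_int(mask)) == to_int(network)


# Hypothetical routing table: (network address, subnet mask, interface).
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "eth0"),          # default route, matches anything
    ("10.0.0.0", "255.0.0.0", "eth1"),
    ("10.1.2.0", "255.255.255.0", "eth2"),
]


def pick_interface(destination: str):
    """Return the interface of the most specific matching route, or None."""
    matches = [
        (to_int(mask), interface)
        for network, mask, interface in ROUTES
        if on_network(destination, network, mask)
    ]
    # For contiguous masks, a numerically larger mask is a longer prefix.
    return max(matches)[1] if matches else None


print(pick_interface("10.1.2.7"))   # eth2
print(pick_interface("10.9.9.9"))   # eth1
print(pick_interface("8.8.8.8"))    # eth0
```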
Solution 4:
Except for @Adrian's answer, I'm not sure any of these actually mention WHY we use the mask instead of some simpler-to-understand solution, and he only touched on the fact that masking is FAST. I mean, why not just specify that you are interested in addresses 192.168.1.200-192.168.1.220? Or why not just use names like *.my.address.com for this, naming each computer instead of assigning numbers?
You actually could now, to some degree, completely remove numbers from routing. Most PCs could handle the kinds of traffic they are sent, but there is still a problem on the larger-scale devices.
Filtering like this is happening all the time, and it's happening a LOT. Masking can be done in hardware, completely eliminating the need to waste time on uninteresting packets (which used to be 99% of the packets you'd have pass through your wire, now with switched hubs you shouldn't see any that aren't addressed to your machine, again making it less relevant).
For a solution that is so easy on the hardware, it is also very flexible. The same hardware can route an entire class A network (10.x.x.x) or just one or two IP addresses with the same implementation.
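A quick sketch of that flexibility, in Python: the same AND-and-compare covers a whole class A, a pair of addresses, or a single host just by changing the mask. The `matches` helper is a made-up name, and the last part uses the standard library's `ipaddress.summarize_address_range` to show how the 192.168.1.200-192.168.1.220 range asked about earlier in this answer would have to be expressed as mask-aligned blocks.

```python
import ipaddress


def matches(address: str, network: str, mask: str) -> bool:
    """Hypothetical helper: one AND and one compare, whatever the mask."""
    as_int = lambda dotted: int(ipaddress.IPv4Address(dotted))
    return (as_int(address) & as_int(mask)) == as_int(network)


# An entire class A network.
print(matches("10.20.30.40", "10.0.0.0", "255.0.0.0"))                # True
# Two addresses (.200 and .201).
print(matches("192.168.1.201", "192.168.1.200", "255.255.255.254"))   # True
# Exactly one address.
print(matches("192.168.1.201", "192.168.1.200", "255.255.255.255"))   # False

# An arbitrary range is not a single mask; it breaks into several blocks.
blocks = list(
    ipaddress.summarize_address_range(
        ipaddress.IPv4Address("192.168.1.200"),
        ipaddress.IPv4Address("192.168.1.220"),
    )
)
for block in blocks:
    print(block.with_netmask)
```

The range needs four network/mask pairs, which is the price of keeping the per-packet check down to one AND and one compare.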
This is not a replacement for any of the other answers, just a little more info.
Solution 5:
"The 1st Octet already specify the network class (1-127: A, 128-191: B, 192-223: C etc.)."
There aren't many protocols in common use today that respect this anymore (see @Fiasco Labs' comment - RIP is the only one I can think of). So, this statement in your question:
"The IP gives all the information a subnet mask has, and more."
is not true for the great majority of protocols in use in the Internet today.
If you have a number of machines that are connected to each other, and only ever communicating with each other, with no router ever involved, then the subnet mask isn't really needed (although modern TCP/IP stacks insist you specify one).
Routers define the edges of (sub)networks. Anything needing to go through a router is on a different network - and vice versa: anything needing to go to a different network needs to go through a router.
The subnet mask is how all machines can tell whether traffic is for the current network or needs to be sent to a router to get to its destination. Your computer's TCP/IP stack will send its traffic directly to the destination if it's within the subnet defined by the mask; otherwise it consults its routing table, and the usual situation is that it sends other traffic to the default gateway.
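A simplified sketch of that decision in Python follows. The addresses, the gateway, and the `next_hop` function name are assumptions for illustration, and a real stack consults a full routing table rather than a single default gateway.

```python
import ipaddress


def next_hop(local_interface: str, default_gateway: str, destination: str) -> str:
    """Hypothetical helper: where does a packet for `destination` go first?

    `local_interface` is the host's own address with its mask,
    e.g. "192.168.1.10/255.255.255.0".
    """
    interface = ipaddress.ip_interface(local_interface)
    target = ipaddress.ip_address(destination)
    if target in interface.network:
        # Same subnet: deliver directly to the destination.
        return str(target)
    # Different network: hand the packet to the router.
    return default_gateway


host = "192.168.1.10/255.255.255.0"   # assumed host address and mask
gateway = "192.168.1.1"               # assumed default gateway

print(next_hop(host, gateway, "192.168.1.77"))  # 192.168.1.77 (direct)
print(next_hop(host, gateway, "8.8.8.8"))       # 192.168.1.1 (via gateway)
```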