When the internet started out, routers only understood classful routes. A classful route looks something like 10.0.0.0/8, 10.1.0.0/16, or 10.1.1.0/24. Those are the sizes of a Class A, B, and C network, respectively. You may be asking, "what's with that / and the number?" That's CIDR notation. The number after the slash (the prefix length) says how many significant bits there are in a network. With a /8, only the first 8 out of 32 bits are significant. An IPv4 address is made up of 32 bits total, 8 bits per octet, which is why an octet is called an octet. Written out in binary with every bit set, an IP address looks like this:
11111111.11111111.11111111.11111111
So when you say /8, that means only the first octet defines the network, or the size of the network, depending on how you look at it.
Subnetting is hard, but once you understand it you feel dumb as hell for ever finding it hard. The /8 is the same thing as a subnet mask, for those of you who know what a subnet mask is. It is equivalent to 255.0.0.0. Likewise, a /4 would be 240.0.0.0 and a /16 is 255.255.0.0. If you look at the binary representations of these masks, you can see how it works pretty easily:
/16 is 11111111.11111111.00000000.00000000
/8 is 11111111.00000000.00000000.00000000
/4 is 11110000.00000000.00000000.00000000
and so on.
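If it helps to see the conversion spelled out, here is a minimal Python sketch of turning a prefix length into a dotted-decimal mask and its binary form. The function names prefix_to_mask and mask_to_binary are made up for this example; they are not part of any library.

```python
def prefix_to_mask(prefix: int) -> str:
    """Return the dotted-decimal subnet mask for a prefix length (0-32)."""
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # Start with 32 one-bits, shift zeros in from the right, keep only 32 bits.
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    # Split the 32-bit number into its four octets, left to right.
    octets = [(mask >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def mask_to_binary(mask: str) -> str:
    """Render a dotted-decimal mask as four 8-bit binary groups."""
    return ".".join(format(int(octet), "08b") for octet in mask.split("."))


for prefix in (4, 8, 16):
    mask = prefix_to_mask(prefix)
    print(f"/{prefix} = {mask} = {mask_to_binary(mask)}")
```

The shift-and-mask trick is just one way to do it; building a string of ones and zeros and slicing it into octets works equally well.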
To simplify, when the internet came about, you had /8, /16, and /24. Those were the only network sizes you could use. Each one corresponds to a number of addresses:
/8 gives you 16,777,216 (roughly 17 million) IP addresses in a network
/16 gives you 65,536 addresses in a network
/24 gives you 256 addresses in a network
Remember, these are not all usable addresses. The first address in a network (the .0 in a /24) is the network identifier, the last address (the .255 in a /24) is the broadcast address, and the router typically requires an address as well. Generally, take 3 addresses off the size of a network, and that's how many usable IP addresses the network has.
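The arithmetic behind those numbers is just a power of two. As a rough sketch, using hypothetical helper names and the "take 3 off" rule of thumb from this post (network identifier, broadcast, router):

```python
def network_size(prefix: int) -> int:
    """Total addresses in a network: 2 to the power of the host bits."""
    return 2 ** (32 - prefix)


def usable_hosts(prefix: int, reserved: int = 3) -> int:
    """Addresses left after the network ID, broadcast, and router.

    The default of 3 reserved addresses is this post's rule of thumb,
    not a fixed standard; pass reserved=2 if the router is counted as a host.
    """
    return max(network_size(prefix) - reserved, 0)


for prefix in (8, 16, 24):
    print(f"/{prefix}: {network_size(prefix)} addresses, "
          f"{usable_hosts(prefix)} usable")
```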
Real World Examples
Let's say the year is 1990 and a company has 23 network devices. The company applies for IP address space knowing that it will never have more than 23 network devices. Since classless routing had not been created yet in 1990, they would be assigned a classful Class C network, also known as 255.255.255.0 or /24. That gives them 253 usable addresses. As you can see, that's a huge waste of address space.
Another, more drastic example: a company has 280 network devices and applies for IP space. Way back in the day, they would have been assigned a Class B network. That's 65,533 usable addresses. Big, big waste. The reason classful networks were used to begin with is that netadmins didn't want to have to remember confusing subnet masks like 255.255.224.0.
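To put a number on the waste in both examples, here is a small sketch of how a classful assignment plays out. The function name and the RESERVED constant are inventions for this example, and the 3 reserved addresses follow the rule of thumb from earlier in this post.

```python
CLASSFUL_PREFIXES = (24, 16, 8)  # Class C, B, A, smallest network first
RESERVED = 3                     # network ID, broadcast, router (rule of thumb)


def smallest_classful_prefix(devices: int) -> int:
    """Pick the smallest classful network that still fits the devices."""
    for prefix in CLASSFUL_PREFIXES:
        if 2 ** (32 - prefix) - RESERVED >= devices:
            return prefix
    raise ValueError("too many devices for a single classful network")


for devices in (23, 280):
    prefix = smallest_classful_prefix(devices)
    usable = 2 ** (32 - prefix) - RESERVED
    print(f"{devices} devices -> /{prefix}, "
          f"{usable - devices} usable addresses sitting idle")
```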
Around the mid-1990s, all of these network guys realized that they were wasting a ton of IP space by assigning these unnecessarily large netblocks to people who only needed a few addresses. A few things came out of that, one of the most useful being RFC 1918 (published in 1996). This was a nightmare but also a blessing at the same time. RFC 1918 introduced the idea of private address space that is not routed on the public internet. That would be the oh so familiar 192.168.0.0/16 block, the probably somewhat familiar 10.0.0.0/8 block, and the less common 172.16.0.0/12 block. The last one is the prettiest, in my opinion. All of these corporations started putting their devices on those blocks. This had an extra security advantage because the devices couldn't be reached directly from the internet. It posed a new problem, though: once all of your devices are on these blocks, how do they get out to the internet? To answer that question, NAT was created. NAT stands for Network Address Translation.
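If you ever need to check whether an address sits inside one of those three private blocks, Python's standard ipaddress module can do the membership test. This is only a sketch; is_rfc1918 is a name made up here, and the list is limited to the three blocks named above.

```python
import ipaddress

# The three private blocks named above.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the three blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


for address in ("192.168.1.10", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
    print(address, is_rfc1918(address))
```

The explicit list is used instead of the module's is_private attribute because that attribute also covers other reserved ranges, not just these three.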
Back to Subnetting
CIDR and classless subnetting were created to alleviate the issue of overallocation. CIDR stands for Classless Inter-Domain Routing. Basically, what CIDR notation does is simplify the confusing netmasks down into single numbers.
Going back to the previous example, an organization in 2009 has 23 network devices. They apply for IP space and ARIN (or whichever regional registry oversees their numbers) assigns them a /27. In binary, a /27 is represented like this:
11111111.11111111.11111111.11100000
You can find the binary representation by starting with all 32 bits set:
11111111.11111111.11111111.11111111
Now if you have a /27, you do 32-27 and that gives you 5. Take the last 5 bits (the first 5 counting from the right) of that binary representation and turn them into zeroes. So you get this:
11111111.11111111.11111111.11100000
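Why a /27 for 23 devices? A rough sketch of the sizing logic, assuming the same 3 reserved addresses (network identifier, broadcast, router) used elsewhere in this post. Both function names are hypothetical, and a real registry weighs more than raw device counts.

```python
def smallest_prefix(devices: int, reserved: int = 3) -> int:
    """Longest prefix (smallest network) that fits devices plus reserved."""
    needed = devices + reserved
    # Host bits required = ceil(log2(needed)), computed with bit_length().
    host_bits = max(needed - 1, 0).bit_length()
    if host_bits > 32:
        raise ValueError("does not fit in IPv4 address space")
    return 32 - host_bits


def mask_bits(prefix: int) -> str:
    """Write the mask as ones followed by (32 - prefix) zeroes, per octet."""
    bits = "1" * prefix + "0" * (32 - prefix)
    return ".".join(bits[i:i + 8] for i in range(0, 32, 8))


prefix = smallest_prefix(23)
print(f"/{prefix} -> {mask_bits(prefix)}")
```

With 23 devices plus 3 reserved addresses you need 26, and the next power of two is 32, which is 5 host bits, so 32-5 gives the /27.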
If you know how to count in binary, it's fairly simple to find out how many addresses are in that network. Take the binary representation of the netmask (11111111.11111111.11111111.11100000) and invert it, so you get 00000000.00000000.00000000.00011111. You can see that the host part of this network only spans 1 octet, the last octet. You then take that binary number and convert it to decimal: 16+8+4+2+1=31, or, if you count 0 as a number too, 32 addresses.
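The invert-and-count trick translates almost directly into code. A minimal sketch, with address_count as a made-up name, cross-checked against Python's standard ipaddress module:

```python
import ipaddress


def address_count(prefix: int) -> int:
    """Invert the netmask, read it as a number, and add 1 for address zero."""
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    inverted = mask ^ 0xFFFFFFFF  # flip all 32 bits of the mask
    return inverted + 1


count = address_count(27)
print(count)

# Cross-check with the standard library; 0.0.0.0 is just a placeholder base.
assert count == ipaddress.ip_network("0.0.0.0/27").num_addresses
```

For the /27 the inverted mask is 31, and adding 1 for address zero gives the 32 addresses worked out by hand above.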
/32s are useful for blackholing a single IP or for interface aliases
/31 has 1+1=2 addresses, one for the subnet itself and one for broadcast
/30s are useful for point-to-point communications
http://www.oav.net/mirrors/cidr.html <- this will be your friend throughout your networking career