Friday, April 15, 2005

Some "Base"-ics

In order to understand computers, some knowledge of math is, fortunately or unfortunately, required. You would think that computer designers would have hidden a lot of the math from the people who use computers by now, but that is not the case. We have "32 bit operating systems" and "256 Megabytes of memory." In order to understand these terms and others, a "bit" (pardon the pun) of math knowledge is required.

If you have fifteen apples in a box, you have just that -- an absolute quantity of fifteen apples. However, there are a number of ways to write down the quantity "fifteen" on paper. You could write "15 apples." You could use Roman numerals and write "XV apples." You could do what I did and type out the words "fifteen apples." If you know that a standard "box of apples" always contains fifteen apples, you could say "one box of apples," just like a "dozen eggs" is always twelve eggs.

The point is that no matter how you represent the quantity of apples, you always have the same absolute number of apples: fifteen.

As noted above, we can use a numeral or symbol to represent the quantity "fifteen" in a number of ways. The numeral system we usually use for this is base ten, or "decimal" -- "15" means "fifteen" in base ten. What does base ten mean? It means that, going from right to left, each position in the numeral represents a quantity ten times the quantity of the previous position. In the case of "15", we have five "ones", and one "ten" -- "ten" is ten times one. Some examples:

234 = four "ones" plus three "tens" plus two "hundreds." One hundred is ten times ten.

4,567 = seven "ones" plus six "tens" plus five "hundreds" plus four "thousands." One thousand is ten times one hundred.

30,892 = two "ones" plus nine "tens" plus eight "hundreds" plus zero "thousands" plus three "ten thousands." Ten thousand is ten times one thousand.
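
If you want to see this positional idea in action, here is a quick sketch in Python (just my choice of language for illustration -- any language would do) that pulls a base-ten numeral apart into the value of each position:

    # Break a base-ten number into what each digit contributes, ones place first.
    def place_values(n):
        values = []
        place = 1
        while n > 0:
            values.append((n % 10) * place)  # last digit times its place value
            n //= 10                         # drop the last digit
            place *= 10                      # the next position is ten times bigger
        return values

    print(place_values(30892))       # [2, 90, 800, 0, 30000]
    print(sum(place_values(30892)))  # 30892 -- the pieces add back up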

However, computers don't do well in the base ten world. Underneath it all, computers recognize only two things: "On" or "Off." Something is there, or not there. Computers are essentially a huge collection of switches that can either be on or off. Everything in a computer is represented by a series of "ons" and "offs." For example, in many computers the letter "A" is represented as "off" "on" "off" "off" "off" "off" "off" "on." The number "fifteen" is represented as "on" "on" "on" "on".

We humans can't easily communicate with one another about computers by saying lots of "ons" and "offs." That's cumbersome. Since there are basically only two positions for a computer switch, computer designers chose the "base two" or "binary" numeral system to represent information on the computer. In the binary system, each position in the numeral represents two times the previous position, not ten. So, you can only have two values in each position: a "1," which is "on," and a "0," which is "off." This makes it a bit easier to represent the letter "A" as "0100 0001" and the number fifteen as "1111".

"1111" is broken down as one "one," plus one "two" (two times one), plus one "four" (two times two), plus one "eight" (two times four) = fifteen.

Since computers can only work with "ons" and "offs," letters and other symbols must be encoded in binary as well. So, the letter "A" is "0100 0001," or sixty-five. Why sixty-five? It could have been anything, but in order to preserve sanity in the world of computers, the American National Standards Institute came up with the "American Standard Code for Information Interchange," or "ASCII" (pronounced "ask-key"), to provide a standard for representing letters, numerals, and symbols in binary. "A" ended up being sixty-five, or "0100 0001" in binary -- one "sixty-four" plus one "one."
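
Here is a small Python sketch of ASCII in action; ord() looks up a character's code, and chr() goes the other way:

    # The letter "A" and its ASCII code, sixty-five.
    print(ord('A'))           # 65
    print(format(65, '08b'))  # '01000001' -- the "0100 0001" pattern from above
    print(64 + 1)             # 65 -- one "sixty-four" plus one "one"
    print(chr(65))            # 'A'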

Regarding the terms "bit" and "byte": "bit" means "binary digit," and a "byte" is eight bits. (Get it? Eight bits make a "byte" -- computer people love their puns. Half a byte, or four bits, makes a "nybble." I am not making this up.) The number fifteen can be represented in four bits, and all of the bits are 1. The letter "A" is represented in eight bits, and the first and seventh bits (counting from the right) are 1.

A "byte," being eight bits, gives you two hundred fifty-six possible values (including zero), which, in the early days of computing, was deemed to be enough. Because of this, a byte is often the smallest piece of data you can work with on a computer.

One more thing, and then we are done for this lesson. As you can see, binary numbers can be quite long. For example, the decimal number 1,048,576 is a very long 1 0000 0000 0000 0000 0000 in binary. Binary numbers quite quickly get so long that you lose all perspective. To make this manageable, computer scientists invented the "Hexadecimal" (or "Hex") numeral system -- base sixteen. Instead of each position from right to left representing ten (decimal) or two (binary) times the previous position, each position represents sixteen times the previous position. Counting from 0 to "10" in Hex looks like:

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10

What is with the A, B, etc.? Well, this is base sixteen, so there need to be sixteen potential values for each digit. Rather than get fancy creating new symbols for ten, eleven, twelve, thirteen, fourteen, and fifteen, computer scientists, being engineers and not marketing types, chose to use the letters A-F.
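
If you would like to see that count without doing it by hand, Python's format() will write numbers in hex for you:

    # Counting from zero to sixteen, written in base sixteen.
    print(', '.join(format(n, 'X') for n in range(17)))
    # 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10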

So, 10 hex is sixteen, and 100 hex is 256 decimal (sixteen times sixteen), and 1000 hex is 4,096 decimal.
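
And the same check in the other direction -- int() with a base of 16 reads hex numerals back into decimal:

    # The hex place values from above, verified.
    print(int('10', 16))    # 16
    print(int('100', 16))   # 256
    print(int('1000', 16))  # 4096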

How on earth does this make computers "easier" (in a computer scientist-sort of way)?

Because a byte, which is eight binary digits, can be represented evenly in two hexadecimal digits. Fifteen is "15" decimal, "F" hex, and "1111" binary. Add one, and you get: "16" decimal, "10" hex, and "0001 0000" binary. A byte that is "all on" looks like: "1111 1111" binary and "FF" hex. Add one, and you get "0001 0000 0000" binary, and "100" hex. So, one hex digit represents four bits.

So, when you look at a number in binary like 1 0000 0000 0000 0000 0000, the number is easily turned into hex: 100000 hex. Likewise, a binary number like "0010 0110 1110" is "26E" in hex -- "0010" is "2," "0110" is "6," and "1110" is fourteen, or "E."
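
That grouping trick is easy to sketch in code, too -- convert each group of four bits on its own and string the results together:

    # Four bits per hex digit: convert each group separately.
    bits = '0010 0110 1110'
    print(''.join(format(int(group, 2), 'X') for group in bits.split()))  # '26E'

    # Or convert the whole number at once -- same answer.
    print(format(int(bits.replace(' ', ''), 2), 'X'))  # '26E'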

To differentiate between a hex number and a decimal number, hex numbers are sometimes preceded by a zero and a small "x" -- 0x100 is 100 hex, and 100 is 100 decimal. So, "26E" is sometimes written as "0x26E".
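
Python happens to use the very same 0x prefix for hex numbers written in code:

    # Hex literals with the 0x prefix.
    print(0x26E)     # 622 in decimal
    print(0xFF + 1)  # 256 -- an "all on" byte, plus one
    print(0x100)     # 256 -- the same value, written directly in hex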

Okay -- take a breath.

To recap:

Computers are essentially a large, large number of switches. These switches can either be "on" or "off." To represent these switches, computer scientists use the binary numeral system, where a "1" is "on" and a "0" is "off." Each digit in a binary number is called a "bit." A collection of eight bits is a "byte." To work easily with bits and bytes, the hexadecimal numeral system is used, in which one hexadecimal digit evenly represents four binary digits.

The practical uses of this are all over computing. Everything, and I mean everything, comes down, eventually, to bits and bytes. Network addresses, encryption keys, and settings for equipment are all, at their core, binary. As we move forward with newer and faster computers, your day-to-day activities will not always bring you face to face with, say, "0xFF," but these values are there nonetheless, and when they do come into view, as they do with wireless networking, it is important to know what you are looking at.

1 comment:

James P said...

And for the matter of that, "A" is also represented as ".-" (read: dit-dah).

.. _ .. ... ... _ .. ._.. ._.. ._ .-.-.-


Morse is about as "obsolete" as binary, but it's still used when all else fails.