Difference between revisions of "Byte"

From NetSec

Revision as of 15:34, 16 November 2011

A byte most often represents 8 bits of data, although other sizes (10, 12, 15...) exist depending on your architecture. A bit of data is simply a 0 or a 1, which may represent an "off" or "on" state, or whatever the programmer decides.

Representation

8-bit bytes are typically written as two hexadecimal characters.

Hexadecimal is usually written with the characters 0 to 9, followed by A to F: A represents 10, B 11, C 12, D 13, E 14 and F 15.
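
As a minimal sketch of this notation, the Python snippet below formats an 8-bit value as two hexadecimal characters. The helper name byte_to_hex is made up for this illustration and is not part of any library.

```python
def byte_to_hex(value: int) -> str:
    """Return an 8-bit value as exactly two uppercase hexadecimal characters."""
    # Hypothetical helper for illustration; only 8-bit bytes are handled here.
    if not 0 <= value <= 0xFF:
        raise ValueError("an 8-bit byte must be in the range 0..255")
    # "02X" means: pad with zeros to width 2, uppercase hexadecimal digits.
    return format(value, "02X")


for sample in (0, 10, 15, 255):
    print(sample, "->", byte_to_hex(sample))
```

Values below 16 get a leading zero, so every byte always takes two characters.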

How to read a hexadecimal number, and a byte

Savitri says
Just like decimal numbers, you epsilons!

Let's take a random byte: 11010011

In hexadecimal, it is: D3

Hexadecimal (the "hexa" is Greek for six, the "decimal" Latin for ten) means "radix 16", or more commonly "base 16".

So, reading from right to left, the first digit is multiplied by 16^0 (equals 1), the second digit by 16^1 (equals 16), the third by 16^2 (equals 256), and the nth by 16^(n-1)...

  • 3 in hexadecimal is 3 in decimal, too (how surprising!)
  • D in hexadecimal is 13 in decimal.

So to convert this number to base 10, we compute:

3 * 1 + 13 * 16 = 3 + 208 = 211
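
The same right-to-left computation can be sketched in Python. The function name hex_to_decimal and the HEX_DIGITS table are assumptions made for this example; in practice the built-in int(text, 16) does the same job.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(text: str) -> int:
    """Convert a hexadecimal string to an integer using positional weights."""
    total = 0
    # Walk the digits from right to left; position 0 has weight 16**0 = 1.
    for position, char in enumerate(reversed(text.upper())):
        digit = HEX_DIGITS.index(char)  # 'D' -> 13, '3' -> 3
        total += digit * 16 ** position
    return total


print(hex_to_decimal("D3"))   # 3 * 1 + 13 * 16 = 211
print(int("11010011", 2))     # the same byte read as binary: 211
```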


Computer data storage

Bytes are a unit of storage. However, most processors handle data as "words", which are made up of 2 or more bytes. When storing a word, a question then arises: which byte shall be stored first?

There are (mainly) two answers to this.

Savitri says
We will not get into the Middle-Endianness hell. Middle-E* problems are always neverending. Ask Mahmoud Abbas.

Big-endianness

The naive, instinctive and natural storage order is to store the number just as most of us English-speaking humans read it: most significant (heaviest) byte first, least significant byte rightmost.

That is, if we are to store 0xDEADBEEF in memory, we will store it like this:

Memory slots:   [1 |2 |3 |4 ]
Bytes:          [DE|AD|BE|EF]

This is convenient for networking: whatever the atomic transmission word size, the bytes of the message stay in the same order.

However, this has implications: for example, to determine parity (whether the number is odd or even), we have to read the 4th memory slot. So some people invented another way to store data.

Processors such as PowerPC, the Motorola 68k family, and others are big-endian. Your brain is big-endian too.
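
The big-endian layout shown above can be reproduced with Python's built-in int.to_bytes. This is only an illustration of the byte order; the variable names are arbitrary, and slots are numbered from 1 to match the diagram.

```python
value = 0xDEADBEEF

# Big-endian: the most significant byte goes into the first slot.
big = value.to_bytes(4, byteorder="big")

for slot, byte in enumerate(big, start=1):
    print(f"slot {slot}: {byte:02X}")

# Parity (odd or even) depends on the least significant byte,
# which sits in the last slot in big-endian order.
is_odd = big[-1] & 1
print("odd" if is_odd else "even")
```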

Little-endianness

Little-endianness is commonly found on x86 processors. In little-endian storage, the least significant byte is stored leftmost in memory (at the lowest address), followed by the more significant bytes, with the most significant byte rightmost.

If we take 0xDEADBEEF, it will be stored in memory as

Memory slots:   [1 |2 |3 |4 ]
Bytes:          [EF|BE|AD|DE]

if our memory has 1 byte slots.
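
A minimal Python sketch of the same byte-by-byte layout, again using the built-in int.to_bytes and int.from_bytes; the variable names are arbitrary.

```python
value = 0xDEADBEEF

# Little-endian: the least significant byte goes into the first slot.
little = value.to_bytes(4, byteorder="little")
print(" ".join(f"{byte:02X}" for byte in little))  # EF BE AD DE

# Reading the bytes back with the same byte order recovers the value.
restored = int.from_bytes(little, byteorder="little")
print(f"{restored:08X}")  # DEADBEEF

# Parity can now be read from the very first slot.
is_odd = little[0] & 1
```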

If our memory, or whatever storage or transmission medium we use, has a word size bigger than 1 byte, then we store data with the least significant word first. With 2-byte words, that gives:

Memory slots:   [1 |2 |3 |4 ]
Words:          [1    |2    ]
Bytes:          [BE EF|DE AD]
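
The word-level view above can be sketched as follows. The function split_into_words_lsw_first is a hypothetical helper written for this example; it returns each word as a number, least significant word first, and says nothing about how the bytes inside a word are laid out.

```python
def split_into_words_lsw_first(value: int, total_bytes: int = 4,
                               word_bytes: int = 2) -> list[int]:
    """Split a value into fixed-size words, least significant word first."""
    word_bits = 8 * word_bytes
    mask = (1 << word_bits) - 1
    words = []
    for _ in range(total_bytes // word_bytes):
        words.append(value & mask)  # keep the lowest word
        value >>= word_bits         # then move on to the next one
    return words


words = split_into_words_lsw_first(0xDEADBEEF)
print([f"{word:04X}" for word in words])  # ['BEEF', 'DEAD']
```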

Since most machines are at least 16 bits, this poses a big problem when communicating between little- and big-endian machines. Search for "the NUXI problem" to understand what I mean. When decoding data, we therefore have to take into account where it comes from in order to get it right.

Just as you cannot properly decipher Arabic if you read it from left to right, you cannot understand little-endian data if you read it from a big-endian point of view.
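
The mismatch described above can be reproduced in a few lines of Python. This is a sketch using the standard struct module and int.from_bytes; the variable names are arbitrary. The second half shows one way the string "UNIX" turns into "NUXI" when its 16-bit words are written back with the opposite byte order.

```python
import struct

# Bytes written by a little-endian machine...
raw = (0xDEADBEEF).to_bytes(4, byteorder="little")

# ...then read from a big-endian point of view, and from the correct one.
wrong = int.from_bytes(raw, byteorder="big")
right = int.from_bytes(raw, byteorder="little")
print(f"{wrong:08X}")  # EFBEADDE
print(f"{right:08X}")  # DEADBEEF

# The NUXI problem: read "UNIX" as two big-endian 16-bit words,
# then write those words back out in little-endian order.
words = struct.unpack(">2H", b"UNIX")
swapped = struct.pack("<2H", *words)
print(swapped)  # b'NUXI'
```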

Reference

Hexadecimal:  0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
Decimal:      0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15