Byte

A byte most often represents 8 bits of data (it can be 10, 12, 15... depending on the architecture). A bit of data is simply a 1 or a 0, which may represent an "off" or "on" state, or whatever the programmer decides.
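
As a minimal illustration (not part of the original article), here is a small C sketch that prints the 8 bits of a single byte, using the byte 11010011 (0xD3) that appears as an example later on:

#include <stdio.h>

int main(void) {
    unsigned char byte = 0xD3;   /* 11010011 in binary */

    /* print the bits from most significant (bit 7) down to least significant (bit 0) */
    for (int bit = 7; bit >= 0; bit--) {
        putchar(((byte >> bit) & 1) ? '1' : '0');
    }
    putchar('\n');   /* prints: 11010011 */
    return 0;
}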

Representation

8-bit bytes are typically written as two hexadecimal characters.

Hexadecimal is notated with the characters 0 to 9 and then A to F: A represents 10, B 11, C 12, D 13, E 14, and F 15.
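
As a quick sketch (illustrative, not from the original article), C's printf "%02X" format prints each byte as exactly two hexadecimal characters:

#include <stdio.h>

int main(void) {
    unsigned char bytes[] = { 0x00, 0x7F, 0xD3, 0xFF };

    /* "%02X" prints each byte as exactly two hexadecimal characters */
    for (int i = 0; i < 4; i++) {
        printf("%02X ", bytes[i]);
    }
    printf("\n");   /* prints: 00 7F D3 FF */
    return 0;
}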

How to read a hexadecimal number, and a byte

Take a random byte: 11010011

In hexadecimal, it is D3.

Hexadecimal (from Greek "hexa", six, and Latin "decem", ten) means "radix 16", or more commonly "base 16".

This means that, reading from right to left, the first digit is multiplied by 16^0 (equals 1), the second digit by 16^1 (equals 16), the third by 16^2, and the nth by 16^(n-1)...

  • 3 in hexadecimal is 3 in decimal, too (how surprising!)
  • D in hexadecimal is 13 in decimal.

To convert this number to base 10, the following calculations are used:

3 * 1 + 13 * 16 = 3 + 208 = 211
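
The same conversion can be checked programmatically; here is a minimal C sketch using the standard strtoul function to parse "D3" as a base-16 number:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* parse "D3" as a base-16 number, exactly as done by hand above */
    unsigned long value = strtoul("D3", NULL, 16);
    printf("%lu\n", value);   /* prints: 211 */
    return 0;
}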


Computer data storage

Bytes are a unit of storage. However, in most processors, data is handled as "words", which consist of 2 or more bytes. When storing data, a question arises: which byte should be stored first?

There are (mainly) two answers to this.


Big-endianness

The naive, instinctive and natural storage order is to store the number just as most English-speaking humans read it: most significant (heaviest) byte first, least significant byte rightmost.

To store 0xDEADBEEF in memory, the bytes are laid out like so:

Memory slots:   [1 |2 |3 |4 ]
Bytes:          [DE|AD|BE|EF]
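
As an illustrative sketch (the function name store_be32 is made up for this example), a 32-bit value can be written out in big-endian order explicitly, using shifts, regardless of the host CPU's native byte order:

#include <stdio.h>
#include <stdint.h>

/* write a 32-bit value into a buffer in big-endian (most significant byte first) order */
static void store_be32(uint8_t out[4], uint32_t value) {
    out[0] = (uint8_t)(value >> 24);   /* DE */
    out[1] = (uint8_t)(value >> 16);   /* AD */
    out[2] = (uint8_t)(value >> 8);    /* BE */
    out[3] = (uint8_t)(value);         /* EF */
}

int main(void) {
    uint8_t buf[4];
    store_be32(buf, 0xDEADBEEF);
    printf("%02X %02X %02X %02X\n", buf[0], buf[1], buf[2], buf[3]);   /* DE AD BE EF */
    return 0;
}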

This has advantages for networking: no matter the atomic transmission word size, the bytes stay in the same order on the wire, which is why big-endian is also known as "network byte order".

However, this has implications: for example, to determine the parity (odd or even) of the value, the 4th memory slot (the least significant byte) has to be read. Some people therefore invented another way to store data.

Processors such as PowerPC, the Motorola 680x0 family, and others are big-endian. The human brain is big-endian too.

Little-endianness

Little-endianness is commonly found on x86 processors. In little-endian order, the least significant byte is stored leftmost in memory (at the lowest address), followed by increasingly significant bytes, with the most significant byte rightmost.

Taking 0xDEADBEEF, it will be stored in memory as

Memory slots:   [1 |2 |3 |4 ]
Bytes:          [EF|BE|AD|DE]

if the memory has 1-byte slots.
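
To observe this on a real machine, a sketch like the following (assuming it runs on a little-endian host such as x86) inspects the individual bytes of a 32-bit value in memory:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t value = 0xDEADBEEF;
    /* view the same memory as individual bytes */
    const uint8_t *bytes = (const uint8_t *)&value;

    for (int i = 0; i < 4; i++) {
        printf("%02X ", bytes[i]);
    }
    printf("\n");   /* on a little-endian host (e.g. x86) prints: EF BE AD DE */
    return 0;
}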

If the memory, or whatever storage/transmission medium is used, has a word size bigger than 1 byte, then the data is stored with the least significant word first. For example, with 2-byte words:

Memory slots:   [1 |2 |3 |4 ]
Words:          [1    |2    ]
Bytes:          [BE EF|DE AD]
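
Here is a sketch of how the pictured layout could be produced by hand: least significant 16-bit word first, with the bytes inside each word most significant first (the historic "middle-endian" order behind the NUXI problem). The helper name store_middle_endian is made up for this illustration:

#include <stdio.h>
#include <stdint.h>

/* store a 32-bit value as two 16-bit words, least significant word first,
   with the bytes inside each word in big-endian order */
static void store_middle_endian(uint8_t out[4], uint32_t value) {
    uint16_t low  = (uint16_t)(value & 0xFFFF);    /* 0xBEEF */
    uint16_t high = (uint16_t)(value >> 16);       /* 0xDEAD */
    out[0] = (uint8_t)(low >> 8);    /* BE */
    out[1] = (uint8_t)(low);         /* EF */
    out[2] = (uint8_t)(high >> 8);   /* DE */
    out[3] = (uint8_t)(high);        /* AD */
}

int main(void) {
    uint8_t buf[4];
    store_middle_endian(buf, 0xDEADBEEF);
    printf("%02X %02X %02X %02X\n", buf[0], buf[1], buf[2], buf[3]);   /* BE EF DE AD */
    return 0;
}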

Since most machines are at least 16 bits, differing byte orders pose a big problem when communicating from little-endian to big-endian machines and vice versa; search for "the NUXI problem" for more background. When decoding data, the byte order it was written with has to be taken into account to decode it correctly.

You cannot understand little-endian data if you read it from a big-endian point of view; each endian form must be read with the byte order it was written in.
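
As a closing sketch (an illustration, not prescribed by the article), a decoder can sidestep the problem by reading bytes explicitly instead of reinterpreting memory with the host's native order; on POSIX systems, ntohl() from <arpa/inet.h> serves a similar purpose for network byte order:

#include <stdio.h>
#include <stdint.h>

/* decode 4 bytes that are known to be in big-endian (network) order,
   regardless of the byte order of the machine running this code */
static uint32_t load_be32(const uint8_t in[4]) {
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
           ((uint32_t)in[3]);
}

int main(void) {
    const uint8_t wire[4] = { 0xDE, 0xAD, 0xBE, 0xEF };
    printf("0x%08X\n", load_be32(wire));   /* prints: 0xDEADBEEF */
    return 0;
}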

Reference

  Hex:      0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
  Decimal:  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15