Byte

From NetSec
Latest revision as of 04:38, 16 May 2012

A byte most often represents 8 bits of data, though some architectures have used other sizes. A bit of data is simply a 1 or a 0, which may represent an "off" or "on" state, or whatever the programmer decides.

Representation

8-bit bytes are typically written as two hexadecimal characters.

Hexadecimal is notated with the digits 0 to 9 followed by the letters A to F: A represents 10, B 11, C 12, D 13, E 14, and F 15.

How to read a hexadecimal number, and a byte

Take an arbitrary byte: 11010011. In hexadecimal, it is D3.

Hexadecimal means "radix 16", or more commonly "base 16".

Reading from right to left, the first digit is multiplied by 16^0 (which equals 1), the second digit by 16^1 (which equals 16), the third by 16^2, and in general the n-th digit by 16^(n-1).

  • 3 in hexadecimal is 3 in decimal, too (how surprising!)
  • D in hexadecimal is 13 in decimal.

To convert this number to base 10, the following calculations are used:

3 * 1 + 13 * 16 = 3 + 208 = 211
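The same calculation can be sketched in Python (a minimal illustration; the variable names are invented for this example):

```python
# Convert the byte 0b11010011 (hex D3) to decimal, digit by digit,
# mirroring the hand calculation above.
hex_digits = "D3"

value = 0
for position, digit in enumerate(reversed(hex_digits)):
    # The n-th digit from the right is weighted by 16**n.
    value += int(digit, 16) * 16 ** position

print(value)          # 211
print(int("D3", 16))  # Python's built-in conversion agrees: 211
print(0b11011011 & 0b11010011)  # the original bit pattern is the same value
```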


Computer data storage

Bytes are a unit of storage. However, most processors handle data as "words", which consist of 2 or more bytes. When storing a word, a question arises: which byte should be stored first?

There are (mainly) two answers to this.


Big-endianness

The naive, instinctive storage order is to store the number just as most English-speaking humans read it: most significant (heaviest) byte first, least significant byte last.

To store 0xDEADBEEF in memory, it is laid out like so:

Memory slots:   [1 |2 |3 |4 ]
Bytes:          [DE|AD|BE|EF]
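This layout can be checked with Python's struct module (a minimal sketch; the format ">I" requests a big-endian unsigned 32-bit integer):

```python
import struct

# Pack 0xDEADBEEF as a big-endian ("network order") 32-bit unsigned integer.
big = struct.pack(">I", 0xDEADBEEF)

print(big.hex())  # deadbeef: most significant byte first
```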

This has advantages for networking: whatever the atomic transmission word size, the bytes keep the same order on the wire, so the message layout never changes.

However, it has drawbacks: to determine the parity of the value, for example, the least significant byte in the 4th memory slot has to be read. So some people invented another way to store data.

Processors such as the PowerPC, the Motorola 68k family, and others are big-endian. The human brain is big-endian too.

Little-endianness

Little-endianness is commonly found on x86 processors. In little-endian order the least significant byte is stored first (leftmost, at the lowest address), then increasingly significant bytes, with the most significant byte last.

Taking 0xDEADBEEF, it is stored in memory as

Memory slots:   [1 |2 |3 |4 ]
Bytes:          [EF|BE|AD|DE]

if the memory has 1 byte slots.
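Again, the struct module can illustrate this (a minimal sketch; "<I" requests a little-endian unsigned 32-bit integer):

```python
import struct

# Pack 0xDEADBEEF as a little-endian 32-bit unsigned integer.
little = struct.pack("<I", 0xDEADBEEF)

print(little.hex())  # efbeadde: least significant byte first
```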

If the memory or the storage/transmission medium has a word size larger than 1 byte, the data is stored least significant word first. If 2-byte words are being used, for instance:

Memory slots:   [1 |2 |3 |4 ]
Words:          [1    |2    ]
Bytes:          [BE EF|DE AD]
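The word-swapped layout above can be reproduced by splitting the value into 16-bit words by hand (a minimal sketch; the variable names are invented for this example):

```python
import struct

# Split 0xDEADBEEF into two 16-bit words and store the least significant
# word first, leaving each word's own bytes in big-endian order, matching
# the layout above.
value = 0xDEADBEEF
low_word = value & 0xFFFF            # 0xBEEF
high_word = (value >> 16) & 0xFFFF   # 0xDEAD
swapped = struct.pack(">HH", low_word, high_word)

print(swapped.hex())  # beefdead
```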

Since most machines use words of at least 16 bits, this poses a big problem when communicating between little- and big-endian systems; searching for "the NUXI problem" turns up further reading. When decoding data, the byte order it was written in has to be taken into account to decode it accurately.

You cannot make sense of little-endian data if you read it from a big-endian point of view; each endian form must therefore be interpreted with the byte order it was written in.

Reference

  hex:      0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
  decimal:  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15