Difference between revisions of "Byte"
Latest revision as of 03:38, 16 May 2012
A byte most often represents 8 bits of data, though other sizes (10, 12, 15...) exist depending on the architecture. A bit is simply a 1 or a 0, which may represent an "on" or "off" state, or whatever the programmer decides.
Representation
8-bit bytes are typically written as two hexadecimal digits.
Hexadecimal is usually written with the digits 0 to 9 followed by the letters A to F, where A represents 10, B 11, C 12, D 13, E 14 and F 15.
How to read a hexadecimal number, and a byte
Taking a random byte: 11010011
In hexadecimal, it is: D3
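The step from 11010011 to D3 can be sketched in code: each group of four bits maps to exactly one hexadecimal digit. This is a minimal illustration, and the function name `byte_to_hex` is made up for this example.

```python
def byte_to_hex(bits: str) -> str:
    """Convert an 8-character string of 0s and 1s to two hex digits."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    digits = "0123456789ABCDEF"
    high = int(bits[:4], 2)  # upper four bits: 1101 -> 13
    low = int(bits[4:], 2)   # lower four bits: 0011 -> 3
    return digits[high] + digits[low]


print(byte_to_hex("11010011"))  # D3
```

Splitting the byte into two halves works because 16 is 2 to the power 4, so four bits always cover exactly one hexadecimal digit.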
Hexadecimal (from Greek "hexa", six, and Latin "decem", ten) means "radix 16", or more commonly "base 16".
This means that, reading from right to left, the first digit is multiplied by 16^0 (which equals 1), the second by 16^1 (which equals 16), the third by 16^2, and the nth by 16^(n-1).
- 3 in hexadecimal is 3 in decimal, too (how surprising!)
- D in hexadecimal is 13 in decimal.
To convert this number to base 10, the following calculation is used:
3 * 1 + 13 * 16 = 3 + 208 = 211
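The same calculation can be written as a short Python sketch that works for any number of hexadecimal digits. The function name `hex_to_decimal` is hypothetical; in practice Python's built-in `int(text, 16)` does the same job.

```python
def hex_to_decimal(text: str) -> int:
    """Sum each digit times 16 raised to its position, counted from the right."""
    digits = "0123456789ABCDEF"
    total = 0
    for position, char in enumerate(reversed(text.upper())):
        # position 0 is the rightmost digit, so its weight is 16**0 == 1
        total += digits.index(char) * 16 ** position
    return total


print(hex_to_decimal("D3"))  # 211
```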
Computer data storage
Bytes are a unit of storage. However, most processors handle data as "words", which are made up of 2 or more bytes. When storing a word, a question arises: which byte should be stored first?
There are (mainly) two answers to this.
Big-endianness
The naive, instinctive and natural storage order is to store the number just as most English-speaking humans read it: most significant (heaviest) byte first, least significant byte last.
0xDEADBEEF is stored in memory like so:
Memory slots: [1 |2 |3 |4 ]
Bytes:        [DE|AD|BE|EF]
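The big-endian layout above can be reproduced with Python's built-in `int.to_bytes`. This is only an illustration of the byte order; the variable names are chosen for this example.

```python
value = 0xDEADBEEF

# byteorder="big" puts the most significant byte at the lowest index
big = value.to_bytes(4, byteorder="big")

print(" ".join(f"{b:02X}" for b in big))  # DE AD BE EF
```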
This is convenient for networking: whatever the size of the transmission unit, the bytes of the message stay in the same order.
However, it has drawbacks: for example, to determine whether the number is odd or even, the 4th memory slot has to be read. This is one reason another storage order was invented.
Processors such as PowerPC and the Motorola 68000 family are big-endian (PowerPC can also run in little-endian mode). The human brain is big-endian too.
Little-endianness
Little-endianness is commonly found on x86 processors. In little-endian order the least significant byte is stored leftmost in memory (at the lowest address), followed by the increasingly significant bytes, with the most significant byte rightmost.
Taking 0xDEADBEEF, it will be stored in memory as
Memory slots: [1 |2 |3 |4 ]
Bytes:        [EF|BE|AD|DE]
if the memory has 1-byte slots.
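As a quick check of the byte-wise little-endian layout, the standard `struct` module can pack the value explicitly as little-endian (`"<I"` means a little-endian unsigned 32-bit integer). The variable names are illustrative.

```python
import struct

value = 0xDEADBEEF

# "<" forces little-endian, "I" is an unsigned 32-bit integer
little = struct.pack("<I", value)

print(" ".join(f"{b:02X}" for b in little))  # EF BE AD DE
```

On a given machine, `sys.byteorder` reports which of the two orders the processor uses natively.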
If the memory, or whatever storage or transmission medium is used, has a word size bigger than 1 byte, then the data is stored with the least significant word first. With 2-byte words:
Memory slots: [1 |2 |3 |4 ]
Words:        [1    |2    ]
Bytes:        [BE EF|DE AD]
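The word-wise ordering can be sketched by peeling words off the low end of the value. The helper name `split_words_le` and its parameters are assumptions made for this example, not part of any library.

```python
def split_words_le(value: int, word_bytes: int = 2, total_bytes: int = 4) -> list:
    """Return the words of value, least significant word first."""
    mask = (1 << (8 * word_bytes)) - 1  # 0xFFFF for 2-byte words
    words = []
    for _ in range(total_bytes // word_bytes):
        words.append(value & mask)      # take the lowest word
        value >>= 8 * word_bytes        # drop it and move to the next
    return words


words = split_words_le(0xDEADBEEF)
print([f"{w:04X}" for w in words])  # ['BEEF', 'DEAD']
```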
Since most machines use words of at least 16 bits, this poses a big problem when communicating between little-endian and big-endian systems. Further information can be found by searching for "the NUXI problem". When decoding data, the position of each byte has to be taken into account to ensure it is decoded correctly.
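The name "NUXI" comes from what happens to the text "UNIX" when the two bytes inside each 16-bit word are swapped. A minimal sketch of that swap follows; the function name `swap_bytes_in_words` is hypothetical.

```python
def swap_bytes_in_words(data: bytes) -> bytes:
    """Swap the two bytes inside every 16-bit word of data."""
    if len(data) % 2:
        raise ValueError("length must be a multiple of 2 bytes")
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes((data[i + 1], data[i]))  # second byte first
    return bytes(out)


print(swap_bytes_in_words(b"UNIX"))  # b'NUXI'
```

Applying the swap twice gives back the original data, which is why the fix for this kind of mismatch is simply to swap again on the receiving side.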
Little-endian data cannot be understood if it is read as though it were big-endian, and vice versa: each form must be read in the order in which it was written.
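To illustrate, the sketch below writes 0xDEADBEEF as little-endian bytes and then reads those bytes back under both assumptions, using the built-in `int.to_bytes` and `int.from_bytes`. The variable names are illustrative.

```python
raw = (0xDEADBEEF).to_bytes(4, "little")  # bytes EF BE AD DE

right = int.from_bytes(raw, "little")  # read in the order it was written
wrong = int.from_bytes(raw, "big")     # read with the wrong assumption

print(hex(right))  # 0xdeadbeef
print(hex(wrong))  # 0xefbeadde
```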
Reference
Hexadecimal: 0 1 2 3 4 5 6 7 8 9 A  B  C  D  E  F
Decimal:     0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15