| en:multiasm:cs:chapter_3_11 [2026/02/08 17:33] – [Integers] ktokarz | en:multiasm:cs:chapter_3_11 [2026/02/19 20:54] (current) – jtokarz | ||
|---|---|---|---|
| + | ====== Fundamentals of Data Encoding, Big Endian, Little Endian ====== | ||
| + | The processor can work with different types of data. These include integers of different sizes, floating-point numbers, text, structures, and even single bits. All this data is stored in the memory as a single byte or multiple bytes. | ||
| + | ===== Integers ===== | ||
| + | Integer data types can be 8, 16, 32 or 64 bits long. If the encoded number is unsigned, it is stored in binary representation, | ||
| + | {{: | ||
| + | where n is the number of bits in a number. | ||
| + | |||
| + | In two's complement representation, | ||
| + | <table binarynumbers> | ||
| + | < | ||
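| + | ^ Number of bits ^ Minimum value (hexadecimal) ^ Maximum value (hexadecimal) ^ Minimum value (decimal) ^ Maximum value (decimal) ^ | ||
| + | | 8 | 0x00 | 0xFF | 0 | 255 | | ||
| + | | 8 signed | 0x80 | 0x7F | -128 | 127 | | ||
| + | | 16 | 0x0000 | 0xFFFF | 0 | 65 535 | | ||
| + | | 16 signed | 0x8000 | 0x7FFF | -32 768 | 32 767 | | ||
| + | | 32 | 0x0000 0000 | 0xFFFF FFFF | 0 | 4 294 967 295 | | ||
| + | | 32 signed | 0x8000 0000 | 0x7FFF FFFF | -2 147 483 648 | 2 147 483 647 | | ||
| + | | 64 | 0x0000 0000 0000 0000 | 0xFFFF FFFF FFFF FFFF | 0 | 18 446 744 073 709 551 615 | | ||
| + | | 64 signed | 0x8000 0000 0000 0000 | 0x7FFF FFFF FFFF FFFF | -9 223 372 036 854 775 808 | 9 223 372 036 854 775 807 | | ||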
| + | </ | ||
| + | |||
| + | ===== Floating point ===== | ||
| + | Integer calculations do not always cover all mathematical requirements of the algorithm. To represent real numbers, floating-point encoding is used. A floating-point number represents the value //A// with three fields: | ||
| + | * Sign bit | ||
| + | * Exponent (E) | ||
| + | * Mantissa (M) | ||
| + | fulfilling the equation\\ | ||
| + | {{: | ||
| + | |||
| + | Two main floating-point formats are in use. A single-precision number is encoded in 32 bits, and a double-precision number is encoded in 64 bits. They are presented in Fig{{ref> | ||
| + | |||
| + | <figure realtypes> | ||
| + | {{: | ||
| + | < | ||
| + | </ | ||
| + | |||
| + | The Table{{ref> | ||
| + | |||
| + | <table realnumbers> | ||
| + | < | ||
| + | ^ Precision | ||
| + | | Single (32 bit) | 8 bits | 23 bits | {{ : | ||
| + | | Double (64 bit) | 11 bits | 52 bits | {{ : | ||
| + | </ | ||
| + | |||
| + | The most common representation of real numbers on computers is standardised in IEEE Standard 754. Two features have been implemented to make the calculations easier for computers: | ||
| + | * the biased exponent, | ||
| + | * the normalised mantissa. | ||
| + | A biased exponent means that a constant bias (127 for single precision, 1023 for double precision) is added to the true exponent before it is stored. The stored exponent is therefore never negative, which makes it easier to compare numbers. | ||
| + | The normalised mantissa is adjusted to have only one bit of the value " | ||
| + | |||
| + | |||
| + | |||
| + | ===== Texts ===== | ||
| + | Texts are represented as a series of characters. In modern operating systems, texts are encoded using Unicode, typically in the UTF-8 or UTF-16 encoding, which can represent not only the 26 basic Latin letters but also the language-specific characters of many different languages. In simpler computers, such as embedded systems, ASCII codes stored one per byte are often used, so every byte of the text in memory contains the ASCII code of a single character. It is quite common in assembler programs to use the zero value (NUL) as the end character of the string, similar to the C/C++ null-terminated string convention. | ||
| + | |||
| + | ===== Endianness ===== | ||
| + | Data encoded in memory must be compatible with the processor. Memory chips are usually organised as a sequence of bytes, which means that every byte can be individually addressed. For processors wider than 8 bits, the question arises of how the bytes of larger data types are ordered in memory. There are two possibilities: | ||
| + | - Little Endian - the low-order byte is stored at a lower address in the memory. | ||
| + | - Big Endian - the high-order byte is stored at a lower address in the memory. | ||
| + | |||
| + | These two methods for a 32-bit class processor are shown in Fig{{ref> | ||
| + | |||
| + | <figure littlebigendian> | ||
| + | {{ : | ||
| + | < | ||
| + | </ | ||
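
As a minimal sketch of the two byte orders described above, the following C code writes a 32-bit value into a byte buffer in each order and checks which order the host processor uses. The function names (`store_le32`, `store_be32`, `bytes_in_memory`, `host_is_little_endian`) and the sample value `0x12345678` are illustrative assumptions and do not come from the text.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bytes of a 32-bit value exactly as the host keeps them in memory. */
static void bytes_in_memory(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Return 1 if the lowest address holds the low-order byte, 0 otherwise. */
static int host_is_little_endian(void)
{
    uint8_t b[4];
    bytes_in_memory(0x12345678u, b);
    return b[0] == 0x78;
}

/* Little Endian: the low-order byte goes to the lowest address. */
static void store_le32(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value);
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}

/* Big Endian: the high-order byte goes to the lowest address. */
static void store_be32(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)(value);
}
```

For the value `0x12345678`, `store_le32` fills the buffer with `0x78, 0x56, 0x34, 0x12`, while `store_be32` produces `0x12, 0x34, 0x56, 0x78`. Building the bytes with shifts rather than pointer casts keeps the result independent of the host's own byte order.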