mirror of
https://github.com/pkivolowitz/asm_book.git
synced 2026-06-22 13:16:50 +08:00
silly change of a word
This commit is contained in:
parent
e1a26d75bd
commit
3831e88e53
1 changed files with 64 additions and 52 deletions
|
|
@ -13,9 +13,10 @@ the changed value might be stored from a register back to RAM.
|
||||||
|
|
||||||
## Loading Data From RAM into Registers
|
## Loading Data From RAM into Registers
|
||||||
|
|
||||||
The instructions used to retrieve information from memory are `ldr` and
|
The instructions typically used to retrieve information from memory are
|
||||||
`ldp`. The characters `ld` in these mnemonics bring to mind `load`.
|
`ldr` and `ldp`. The characters `ld` in these mnemonics bring to mind
|
||||||
`ldr` is "load a register" while `ldp` is "load a pair of registers".
|
`load`. `ldr` is "load a register (from RAM)" while `ldp` is "load a
|
||||||
|
pair of registers (from RAM)".
|
||||||
|
|
||||||
Both of these instructions possess many variations, only a few of which
|
Both of these instructions possess many variations, only a few of which
|
||||||
will be described here. Common to all variations of the `ldr` and `ldp`
|
will be described here. Common to all variations of the `ldr` and `ldp`
|
||||||
|
|
@ -40,10 +41,12 @@ Similarly,
|
||||||
|
|
||||||
loads a *pair* of registers from RAM at the address specified by the
|
loads a *pair* of registers from RAM at the address specified by the
|
||||||
stack pointer. Any `x` register could also have been used, the `sp` is
|
stack pointer. Any `x` register could also have been used, the `sp` is
|
||||||
shown here to demonstrate that it too can be used.
|
shown here to demonstrate that it too can be used. In fact, loads
|
||||||
|
relative to the stack pointer is how compilers implement local
|
||||||
|
variables.
|
||||||
|
|
||||||
What goes inside the `[]` is always a pointer so must be a 64 bit wide
|
What goes inside the `[]` is always a pointer so must be a 64 bit wide
|
||||||
animal such as any `x` register or the stack *pointer*.
|
entity such as any `x` register or the stack *pointer*.
|
||||||
|
|
||||||
## Offsets
|
## Offsets
|
||||||
|
|
||||||
|
|
@ -75,19 +78,22 @@ register either before the actual fetch or after.
|
||||||
|
|
||||||
Assume `ptr` is a pointer to a `long`:
|
Assume `ptr` is a pointer to a `long`:
|
||||||
|
|
||||||
* Line 2 corresponds to: `*(ptr++)`.
|
* Line 2 can correspond to: `*(ptr++)`.
|
||||||
* Line 3 corresponds to: `*(++ptr)`.
|
|
||||||
|
* Line 3 can correspond to: `*(++ptr)`.
|
||||||
|
|
||||||
Note this is for illustration only in that the `++` syntax in C and C++
|
Note this is for illustration only in that the `++` syntax in C and C++
|
||||||
increment by 1. In lines 2 and 3, `#simm` can have values other than 1
|
increment by 1. In lines 2 and 3, `#simm` can have values other than 1
|
||||||
including negative values for decrements.
|
including negative values for decrements.
|
||||||
|
|
||||||
Also note that when used with the stack pointer `sp`, `#simm` must be a
|
Also note that when used with the stack pointer `sp`, `#simm` must be a
|
||||||
multiple of 16.
|
multiple of 16. That is, all modifications to the stack pointer must be
|
||||||
|
in multiples of 16.
|
||||||
|
|
||||||
Concerning the restrictions placed on the offsets:
|
Concerning the restrictions placed on the offsets:
|
||||||
|
|
||||||
* `simm` can be in the range of -256 to 255 (10 byte signed value).
|
* `simm` can be in the range of -256 to 255 (10 byte signed value).
|
||||||
|
|
||||||
* `pimm` can be in the range of 0 to 32760 in multiples of 8.
|
* `pimm` can be in the range of 0 to 32760 in multiples of 8.
|
||||||
|
|
||||||
`w` registers are used for `int`, `short` and `char`. When working with
|
`w` registers are used for `int`, `short` and `char`. When working with
|
||||||
|
|
@ -111,6 +117,9 @@ The AARCH64 ISA includes an even more exotic means of performing mass
|
||||||
calculation called SVE. We will probably never cover AVE as no generally
|
calculation called SVE. We will probably never cover AVE as no generally
|
||||||
available processor implements it. This includes Apple Silicon.
|
available processor implements it. This includes Apple Silicon.
|
||||||
|
|
||||||
|
I wonder how sad this makes certain engineers at ARM. After all, imagine
|
||||||
|
if YOU threw an ISA and nobody came.
|
||||||
|
|
||||||
## Examples
|
## Examples
|
||||||
|
|
||||||
### Loading (Storing) Various Sizes of Integers
|
### Loading (Storing) Various Sizes of Integers
|
||||||
|
|
@ -125,6 +134,7 @@ available processor implements it. This includes Apple Silicon.
|
||||||
Notice the following:
|
Notice the following:
|
||||||
|
|
||||||
* Pointers and longs use `x` registers.
|
* Pointers and longs use `x` registers.
|
||||||
|
|
||||||
* All other integer sizes use `w` registers where the instruction itself
|
* All other integer sizes use `w` registers where the instruction itself
|
||||||
specifies the size.
|
specifies the size.
|
||||||
|
|
||||||
|
|
@ -133,15 +143,15 @@ Notice the following:
|
||||||
Consider this code to sum up the values in an array:
|
Consider this code to sum up the values in an array:
|
||||||
|
|
||||||
```c
|
```c
|
||||||
long Sum(long * values, long length) /* 1 */
|
long Sum(long * values, long length) /* 1 */
|
||||||
{ /* 2 */
|
{ /* 2 */
|
||||||
long sum = 0; /* 3 */
|
long sum = 0; /* 3 */
|
||||||
for (long i = 0; i < length; i++) /* 4 */
|
for (long i = 0; i < length; i++) /* 4 */
|
||||||
{ /* 5 */
|
{ /* 5 */
|
||||||
sum += values[i]; /* 6 */
|
sum += values[i]; /* 6 */
|
||||||
} /* 7 */
|
} /* 7 */
|
||||||
return sum; /* 8 */
|
return sum; /* 8 */
|
||||||
} /* 9 */
|
} /* 9 */
|
||||||
```
|
```
|
||||||
|
|
||||||
We're not going to translate this to assembly language. Instead, we will
|
We're not going to translate this to assembly language. Instead, we will
|
||||||
|
|
@ -154,16 +164,16 @@ fantastically inefficient (in this case).
|
||||||
Consider the following code that performs the same function:
|
Consider the following code that performs the same function:
|
||||||
|
|
||||||
```c
|
```c
|
||||||
long Sum(long * values, long length) /* 1 */
|
long Sum(long * values, long length) /* 1 */
|
||||||
{ /* 2 */
|
{ /* 2 */
|
||||||
long sum = 0; /* 3 */
|
long sum = 0; /* 3 */
|
||||||
long * end = values + length; /* 4 */
|
long * end = values + length; /* 4 */
|
||||||
while (values < end) /* 5 */
|
while (values < end) /* 5 */
|
||||||
{ /* 6 */
|
{ /* 6 */
|
||||||
sum += *(values++); /* 7 */
|
sum += *(values++); /* 7 */
|
||||||
} /* 8 */
|
} /* 8 */
|
||||||
return sum; /* 9 */
|
return sum; /* 9 */
|
||||||
} /* 10 */
|
} /* 10 */
|
||||||
```
|
```
|
||||||
|
|
||||||
Notice we don't use an index variable any longer. Instead, we use the
|
Notice we don't use an index variable any longer. Instead, we use the
|
||||||
|
|
@ -186,40 +196,42 @@ in both `C` and `C++`. It is *similar in spirit* to this in `C++`:
|
||||||
Here is a hand translation of the above `C` code for function `Sum()`:
|
Here is a hand translation of the above `C` code for function `Sum()`:
|
||||||
|
|
||||||
```asm
|
```asm
|
||||||
.global Sum // 1
|
.global Sum // 1
|
||||||
.text // 2
|
.text // 2
|
||||||
.align 4 // 3
|
.align 4 // 3
|
||||||
|
|
||||||
// x0 is the pointer to data // 5
|
// x0 is the pointer to data // 5
|
||||||
// x1 is the length and is reused as `end` // 6
|
// x1 is the length and is reused as `end` // 6
|
||||||
// x2 is the sum // 7
|
// x2 is the sum // 7
|
||||||
// x3 is the current dereferenced value // 8
|
// x3 is the current dereferenced value // 8
|
||||||
|
|
||||||
Sum: // 10
|
Sum: // 10
|
||||||
mov x2, xzr // 11
|
mov x2, xzr // 11
|
||||||
add x1, x0, x1, lsl 3 // 12
|
add x1, x0, x1, lsl 3 // 12
|
||||||
b 2f // 13
|
b 2f // 13
|
||||||
|
|
||||||
1: ldr x3, [x0], 8 // 15
|
1: ldr x3, [x0], 8 // 15
|
||||||
add x2, x2, x3 // 16
|
add x2, x2, x3 // 16
|
||||||
2: cmp x0, x1 // 17
|
2: cmp x0, x1 // 17
|
||||||
blt 1b // 18
|
blt 1b // 18
|
||||||
|
|
||||||
mov x0, x2 // 20
|
mov x0, x2 // 20
|
||||||
ret // 21
|
ret // 21
|
||||||
|
`
|
||||||
.end // 23
|
.end // 23
|
||||||
```
|
```
|
||||||
|
|
||||||
Recall that `Sum(long * values, long length)` means that `x0` has the
|
Recall that `Sum(long * values, long length)` means that `x0` has the
|
||||||
address of the first long in the array.
|
address of the first long in the array.
|
||||||
|
|
||||||
* We know it's an `x` register because it is an address.
|
* We know it's an `x` register because it is an address.
|
||||||
|
|
||||||
* We know it is the `0` register because it is the first argument.
|
* We know it is the `0` register because it is the first argument.
|
||||||
|
|
||||||
`x1` contains `length`.
|
`x1` contains `length`.
|
||||||
|
|
||||||
* We know it is an `x` register because it is a `long`.
|
* We know it is an `x` register because it is a `long`.
|
||||||
|
|
||||||
* We know it is the `1` register because it is the second argument.
|
* We know it is the `1` register because it is the second argument.
|
||||||
|
|
||||||
`Line 11` shows the first use of a "zero register," in this case `xzr`.
|
`Line 11` shows the first use of a "zero register," in this case `xzr`.
|
||||||
|
|
@ -253,14 +265,14 @@ code is written. We saw how this can save an instruction
|
||||||
(dereference) but also the *post increment* of the pointer.
|
(dereference) but also the *post increment* of the pointer.
|
||||||
|
|
||||||
```c
|
```c
|
||||||
sum += *(values++); /* 7 */
|
sum += *(values++); /* 7 */
|
||||||
```
|
```
|
||||||
|
|
||||||
is implemented by both `lines 15` and `16` in the assembly language.
|
is implemented by both `lines 15` and `16` in the assembly language.
|
||||||
|
|
||||||
```asm
|
```asm
|
||||||
1: ldr x3, [x0], 8 // 15
|
1: ldr x3, [x0], 8 // 15
|
||||||
add x2, x2, x3 // 16
|
add x2, x2, x3 // 16
|
||||||
```
|
```
|
||||||
|
|
||||||
`Line 17` compares the pointer to where we are now in the array to the
|
`Line 17` compares the pointer to where we are now in the array to the
|
||||||
|
|
@ -294,9 +306,9 @@ is a multiple of 8. This will become apparent when we discuss `structs`.
|
||||||
|
|
||||||
### Faster Memory Copy
|
### Faster Memory Copy
|
||||||
|
|
||||||
This is a heavily contrived example. In reality it is a fun challenge to
|
This is a *heavily* contrived example. In reality it is a fun challenge
|
||||||
write an optimal general purpose `memcpy` function. Or, you can just use
|
to write an optimal general purpose `memcpy` function. Or, you can just
|
||||||
`memcpy`.
|
use `memcpy`.
|
||||||
|
|
||||||
For the purposes of this discussion, ignore issues relating to
|
For the purposes of this discussion, ignore issues relating to
|
||||||
alignment.
|
alignment.
|
||||||
|
|
|
||||||
Loading…
Reference in a new issue