improvements to text

Perry Kivolowitz 2024-02-28 16:40:12 -06:00
parent 144acfb9d6
commit 776ebc0545
7 changed files with 57 additions and 48 deletions

View file

@ -172,7 +172,16 @@ and returns to us.
System calls are functions implemented inside the operating system.
To get there, at some point perhaps behind a wrapper function, a
specific system call number is placed in `x8` with other scratch
registers getting the system call's documented parameters and the `svc`
instruction is executed.
To get there, at some point, perhaps behind a wrapper function found in
the CRT (C Run Time library), a distro-specific system call number is
placed in `x8`, the other scratch registers get the system call's
documented parameters, and the `svc` instruction is executed with
argument 0.
### Preference
We suggest using the CRT wrapper functions where possible because:
* They are easier to code
* They are portable between distributions of the OS
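
As a rough illustration of this preference, the following C sketch asks
for the process ID twice: once through the CRT wrapper `getpid()` and
once through the generic `syscall()` function with the `SYS_getpid`
constant. It assumes Linux with glibc or a compatible C library. The
helper names `pid_via_wrapper` and `pid_via_raw_syscall` are invented
for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   // SYS_getpid (the system call number)
#include <sys/types.h>     // pid_t
#include <unistd.h>        // getpid(), syscall()

// Hypothetical helper: let the CRT wrapper do the work.
long pid_via_wrapper(void) {
    return (long) getpid();
}

// Hypothetical helper: name the system call number ourselves.
// The number behind SYS_getpid is supplied by the system headers.
long pid_via_raw_syscall(void) {
    return syscall(SYS_getpid);
}
```

Both helpers should report the same process ID. The wrapper version
never mentions a system call number, which is the portability argument
made above.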

View file

@ -68,8 +68,8 @@ allows the function to `ret`urn.
Branch-with-link computes the address of the instruction following it.
It places this address into register `x30` and then branches to the
label provided. It makes one link of breadcrumbs to follow to get back
following a `ret`.
label provided. It makes one link of a trail of breadcrumbs to follow to
get back following a `ret`.
**This is why it is absolutely essential to backup `x30` inside your
functions if they call other functions themselves.**
@ -125,19 +125,17 @@ a `bl` instruction. At the moment `main()` entered, the address to
which it needed to return was sitting in `x30`.
Then, `main()` called a function - in this case `puts()` but which
function doesn't matter - it called a function. In doing so, it
overwrote the address to which `main()` needed to return with the
address of line 7 in the code. That is where `puts()` needs to
return.
function is called doesn't matter - it called a function. In doing so,
it overwrote the address to which `main()` needed to return with the
address of line 7 in the code. That is where `puts()` needs to return.
So, when line 7 executes it puts the contents of `x30` into the
program counter and branches to it.
And the problem with this is?
Hint: notice where `gdb` put us after
the control-C. Still on line 7. An infinite loop of returning to the
return statement.
Hint: notice where `gdb` put us after the control-C. Still on line 7. An
infinite loop of returning to the return statement.
Here is a fixed version of the code:
@ -159,7 +157,7 @@ hw: .asciz "Hello World!" // 12
```
The address to which `main()` should return is pushed onto the stack on
line 5. It should be safe there.
line 5. It should be safe there, barring badly written code elsewhere.
It is recovered from the stack on line 8 and used by line 9's `ret`.
@ -182,11 +180,13 @@ that return no value.
What about functions that do return a value?
In the AARCH64 Linux style calling convention, values are returned in
`x0` and sometimes also returned in `x1` though this is uncommon.
`x0` and sometimes also returned in other scratch registers though this
is uncommon. Functions with more than one return value are not supported
by C or C++ but they can be written in assembly language where the rules
are yours to break.
Note that `x0` and `x1` could also be `w0` and `w1` or even the first
and second floating point registers if the function is returning a
`float` or `double`.
Note that `x0` could also be `w0` or the first floating point register
if the function is returning a `float` or `double`.
Here are samples, first in C / C++ then in the corresponding assembly
language:

Binary file not shown.

View file

@ -2,14 +2,14 @@
How parameters are passed to functions can be different from OS to OS.
This chapter is written to the standard implemented for Linux. It
differs from the **calling convention** used on, for example, the Mac in
that parameters are principally passed via the scratch registers.
differs from the **calling convention** used on Apple Silicon where, for
example, *variadic* functions are handled differently.
Up to 8 parameters can be passed directly via registers. Each parameter
can be up to the size of an address, long or double (8 bytes). If you
need to pass more than 8 parameters or you need to pass parameters which
are larger than 8 bytes or are `structs`, you would use a different
technique described later.
Up to 8 parameters can be passed directly via scratch registers. Each
parameter can be up to the size of an address, long or double (8 bytes).
If you need to pass more than 8 parameters or you need to pass
parameters which are larger than 8 bytes or are `struct`s passed by
value, you would use a different technique described later.
Remember that even large data structures that are passed by reference
are, in fact, passed via their base address (as a pointer).
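
A minimal C sketch of the two cases just described. The function names
`sum8` and `sum_array` are hypothetical, and the register assignments in
the comments assume the Linux AARCH64 convention this chapter follows.

```c
// Hypothetical function with exactly eight integer parameters.
// Under the convention described above, a through h would arrive
// in x0 through x7 respectively.
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h) {
    return a + b + c + d + e + f + g + h;
}

// Hypothetical function taking a large array by reference.
// Only the base address (x0) and the count (x1) are passed,
// not the data itself.
long sum_array(const long *values, long count) {
    long total = 0;
    for (long i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

However large the array is, `sum_array` still receives just two
register-sized parameters.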
@ -20,7 +20,15 @@ For the purposes of the present discussion, we assume all parameters are
Up to 8 parameters are passed in the scratch registers (of which there
are a matching 8). These are `x0` through `x7`. *Scratch* means the
value of the register can be changed at will without any need to backup
or restore their values.
or restore their values across function calls.
**This means that you cannot count on the contents of the scratch
registers maintaining their value if your function makes any function
calls.**
**This means that you cannot count on the contents of the scratch
registers maintaining their value if your function makes any function
calls.**
**This means that you cannot count on the contents of the scratch
registers maintaining their value if your function makes any function
@ -130,7 +138,11 @@ func: ldr x2, [x0] // 1
ret // 5
```
The `add` instruction cannot operate on values in memory.
The value of `x0` on return is, in the general sense, undefined because
this is a `void` function.
The `add` instruction cannot operate on values in memory, only on
registers.
With little exception, all the *action* takes place in registers, not
memory. Therefore, the underlying values pointed to by the parameters
@ -200,33 +212,21 @@ how would the assembly language change?
Answer: just a little:
```asm
func: ldr x2, [x0] // 1
ldr x3, [x1] // 2
add x2, x2, x3 // 3
mov x0, x2 // 4
ret // 5
func: ldr x0, [x0] // 1
ldr x1, [x1] // 2
add x0, x0, x1 // 3
ret // 4
```
Wait, why can we use `x0` and `x1` for the incoming addresses AND for
holding values? Because nothing is stored back to the memory locations
housing `p1` and `p2` in this function, the addresses are not needed
after the loads. In the previous example, `p1` was written back to, so
its address had to be kept.
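
For comparison, here is a C sketch of what the two variants might look
like. These are assumed reconstructions for illustration only. The names
`add_in_place` and `add_and_return` are hypothetical, and `long` is
chosen to match the 8-byte `x` registers.

```c
// Assumed shape of the previous example: the sum is stored back
// through p1, so the address in p1 is still needed after the loads.
void add_in_place(long *p1, long *p2) {
    *p1 = *p1 + *p2;
}

// Assumed shape of this example: the sum is returned instead.
// Nothing is stored through p1 or p2, so once the values are loaded
// the registers that held the addresses are free to be reused.
long add_and_return(long *p1, long *p2) {
    return *p1 + *p2;
}
```

Only the first version modifies memory. The second leaves both operands
untouched and hands the answer back as the return value.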
Passing by reference is also an instruction to the compiler to treat
pointers a little differently - the differences don't show up here so
the only change to our pointer passing version is how we return
the answer.
But wait...
There is a small optimization we can make here:
```asm
func: ldr x0, [x0] // 1
ldr x1, [x1] // 2
add x0, x0, x1 // 3
ret // 4
```
This time we're not storing anything back to `p1` or `p2` so we can
reuse `x0` and `x1` since the addresses they contained aren't needed
again. Smart human!
## What If We Need More Than Eight Parameters?
First, do you **really** need to pass more than 8 parameters?

Binary file not shown.

View file

@ -14,7 +14,7 @@
main:
stp x29, x30, [sp, -16]!
mov x0, 3510
mov w0, 256
mov x8, 93
svc 0
ldp x29, x30, [sp], 16

Binary file not shown.