improvements to text

Perry Kivolowitz 2024-02-28 16:40:12 -06:00
parent 144acfb9d6
commit 776ebc0545
7 changed files with 57 additions and 48 deletions

View file

@ -172,7 +172,16 @@ and returns to us.
System calls are functions implemented inside the operating system. System calls are functions implemented inside the operating system.
To get there, at some point perhaps behind a wrapper function, a To get there, at some point perhaps behind a wrapper function found in
specific system call number is placed in `x8` with other scratch the CRT (C Run Time library), a distro specific system call number is
registers getting the system call's documented parameters and the `svc` placed in `x8` with other scratch registers getting the system call's
instruction is executed. documented parameters and the `svc` instruction is executed with
argument 0.
### Preference
We suggest using the CRT wrapper functions where possible because:
* They are easier to code
* They are portable between distributions of the OS
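
As a rough illustration of this preference, the following C sketch asks Linux for the process ID twice: once through the CRT wrapper `getpid()` and once through the generic `syscall()` entry point using the `SYS_getpid` constant from `<sys/syscall.h>`. The helper names `pid_via_wrapper` and `pid_via_raw_syscall` are made up for this example, and the sketch assumes a Linux system with glibc or a compatible C library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>   /* SYS_getpid: the platform's system call number */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the portable route through the CRT wrapper. */
long pid_via_wrapper(void)
{
    return (long)getpid();
}

/* Hypothetical helper: the same request made by system call number. */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}
```

Both helpers should report the same value. Only the first avoids naming a system call number in your own code.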

View file

@ -68,8 +68,8 @@ allows the function to `ret`urn.
Branch-with-link computes the address of the instruction following it. Branch-with-link computes the address of the instruction following it.
It places this address into register `x30` and then branches to the It places this address into register `x30` and then branches to the
label provided. It makes one link of breadcrumbs to follow to get back label provided. It makes one link of a trail of breadcrumbs to follow to
following a `ret`. get back following a `ret`.
**This is why it is absolutely essential to back up `x30` inside your **This is why it is absolutely essential to back up `x30` inside your
functions if they call other functions themselves.** functions if they call other functions themselves.**
@ -125,19 +125,17 @@ a `bl` instruction. At the moment `main()` entered, the address to
which it needed to return was sitting in `x30`. which it needed to return was sitting in `x30`.
Then, `main()` called a function - in this case `puts()` but which Then, `main()` called a function - in this case `puts()` but which
function doesn't matter - it called a function. In doing so, it function is called doesn't matter - it called a function. In doing so,
overwrote the address to which `main()` needed to return with the it overwrote the address to which `main()` needed to return with the
address of line 7 in the code. That is where `puts()` needs to address of line 7 in the code. That is where `puts()` needs to return.
return.
So, when line 7 executes it puts the contents of `x30` into the So, when line 7 executes it puts the contents of `x30` into the
program counter and branches to it. program counter and branches to it.
And the problem with this is? And the problem with this is?
Hint: notice where `gdb` put us after Hint: notice where `gdb` put us after the control-C. Still on line 7. An
the control-C. Still on line 7. An infinite loop of returning to the infinite loop of returning to the return statement.
return statement.
Here is a fixed version of the code: Here is a fixed version of the code:
@ -159,7 +157,7 @@ hw: .asciz "Hello World!" // 12
``` ```
The address to which `main()` should return is pushed onto the stack on The address to which `main()` should return is pushed onto the stack on
line 5. It should be safe there. line 5. It should be safe there, barring badly written code elsewhere.
It is recovered from the stack on line 8 and used by line 9's `ret`. It is recovered from the stack on line 8 and used by line 9's `ret`.
@ -182,11 +180,13 @@ that return no value.
What about functions that do return a value? What about functions that do return a value?
In the AARCH64 Linux style calling convention, values are returned in In the AARCH64 Linux style calling convention, values are returned in
`x0` and sometimes also returned in `x1` though this is uncommon. `x0` and sometimes also returned in other scratch registers though this
is uncommon. Functions with more than one return value are not supported
by C or C++, but they can be written in assembly language where the rules
are yours to break.
Note that `x0` and `x1` could also be `w0` and `w1` or even the first Note that `x0` could also be `w0` or the first floating point register
and second floating point registers if the function is returning a if the function is returning a `float` or `double`.
`float` or `double`.
Here are samples, first in C / C++ then in the corresponding assembly Here are samples, first in C / C++ then in the corresponding assembly
language: language:

Binary file not shown.

View file

@ -2,14 +2,14 @@
How parameters are passed to functions can be different from OS to OS. How parameters are passed to functions can be different from OS to OS.
This chapter is written to the standard implemented for Linux. It This chapter is written to the standard implemented for Linux. It
differs from the **calling convention** used on, for example, the Mac in differs from the **calling convention** used on Apple Silicon where
that parameters are principally passed via the scratch registers. *variadic* functions are used, for example.
Up to 8 parameters can be passed directly via registers. Each parameter Up to 8 parameters can be passed directly via scratch registers. Each
can be up to the size of an address, long or double (8 bytes). If you parameter can be up to the size of an address, long or double (8 bytes).
need to pass more than 8 parameters or you need to pass parameters which If you need to pass more than 8 parameters or you need to pass
are larger than 8 bytes or are `structs`, you would use a different parameters which are larger than 8 bytes or are `structs` passed by
technique described later. value, you would use a different technique described later.
Remember that even large data structures that are passed by reference Remember that even large data structures that are passed by reference
are, in fact, passed via their base address (as a pointer). are, in fact, passed via their base address (as a pointer).
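
A minimal C sketch of that last point, using a made-up structure `big_record` and a made-up function `sum_record`: the structure is far too large for a register, yet the call only hands over its base address, which as the first parameter would arrive in `x0` under the convention described here.

```c
/* A structure far larger than the 8 bytes a single register can carry. */
struct big_record {
    long values[64];
};

/* Hypothetical function: only the base address of the structure is passed. */
long sum_record(const struct big_record *rec)
{
    long total = 0;
    for (int i = 0; i < 64; i++) {
        total += rec->values[i];
    }
    return total;
}
```

A caller would write `sum_record(&r)` for some `struct big_record r`, which costs one register no matter how large the structure grows.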
@ -20,7 +20,15 @@ For the purposes of the present discussion, we assume all parameters are
Up to 8 parameters are passed in the scratch registers (of which there Up to 8 parameters are passed in the scratch registers (of which there
are a matching 8). These are `x0` through `x7`. *Scratch* means the are a matching 8). These are `x0` through `x7`. *Scratch* means the
value of the register can be changed at will without any need to back up value of the register can be changed at will without any need to back up
or restore their values. or restore their values across function calls.
**This means that you cannot count on the contents of the scratch
registers maintaining their value if your function makes any function
calls.**
**This means that you cannot count on the contents of the scratch
registers maintaining their value if your function makes any function
calls.**
**This means that you cannot count on the contents of the scratch **This means that you cannot count on the contents of the scratch
registers maintaining their value if your function makes any function registers maintaining their value if your function makes any function
@ -130,7 +138,11 @@ func: ldr x2, [x0] // 1
ret // 5 ret // 5
``` ```
The `add` instruction cannot operate on values in memory. The value of `x0` on return is, in the general sense, undefined because
this is a `void` function.
The `add` instruction cannot operate on values in memory. Only upon
registers.
With little exception, all the *action* takes place in registers, not With little exception, all the *action* takes place in registers, not
memory. Therefore, the underlying values pointed to by the parameters memory. Therefore, the underlying values pointed to by the parameters
@ -200,33 +212,21 @@ how would the assembly language change?
Answer: just a little: Answer: just a little:
```asm ```asm
func: ldr x2, [x0] // 1 func: ldr x0, [x0] // 1
ldr x3, [x1] // 2 ldr x1, [x1] // 2
add x2, x2, x3 // 3 add x0, x0, x1 // 3
mov x0, x2 // 4 ret // 4
ret // 5
``` ```
Wait, why can we use x0 and x1 for the incoming address AND for holding
values? Because the memory locations housing p1 and p2 are not disturbed
in this function, whereas p1 was written back to in the previous example.
Passing by reference is also an instruction to the compiler to treat Passing by reference is also an instruction to the compiler to treat
pointers a little differently - the differences don't show up here so pointers a little differently - the differences don't show up here so
the only change to our pointer passing version is how we return the only change to our pointer passing version is how we return
the answer. the answer.
But wait...
There is a small optimization we can make here:
```asm
func: ldr x0, [x0] // 1
ldr x1, [x1] // 2
add x0, x0, x1 // 3
ret // 4
```
This time we're not storing anything back to `p1` or `p2` so we can
reuse `x0` and `x1` since the addresses they contained aren't needed
again. Smart human!
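
For comparison, here is a hedged C sketch of the two shapes of function being contrasted, written with pointers. The names `add_in_place` and `add_and_return` are hypothetical, and the first function assumes, as the text says of the previous example, that the sum is written back through `p1`.

```c
/* Stores the sum back through p1, so the address held in p1 is still
   needed after both values have been loaded. */
void add_in_place(long *p1, const long *p2)
{
    *p1 = *p1 + *p2;
}

/* Returns the sum instead. Nothing is written back, so once the values
   are loaded the incoming addresses are no longer needed. */
long add_and_return(const long *p1, const long *p2)
{
    return *p1 + *p2;
}
```

The second function is the case where the registers that carried the addresses are free to be reused for the values themselves.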
## What If We Need More Than Eight Parameters? ## What If We Need More Than Eight Parameters?
First, do you **really** need to pass more than 8 parameters? First, do you **really** need to pass more than 8 parameters?

Binary file not shown.

View file

@ -14,7 +14,7 @@
main: main:
stp x29, x30, [sp, -16]! stp x29, x30, [sp, -16]!
mov x0, 3510 mov w0, 256
mov x8, 93 mov x8, 93
svc 0 svc 0
ldp x29, x30, [sp], 16 ldp x29, x30, [sp], 16

Binary file not shown.