floating point / apple silicon improvements

This commit is contained in:
Perry Kivolowitz 2023-01-18 17:43:26 -06:00
parent 6f7f27aaa6
commit 4f3d2de398
4 changed files with 100 additions and 217 deletions

View file

@ -10,14 +10,12 @@ MAIN
mov x29, sp mov x29, sp
LLD_ADDR x0, fmt LLD_ADDR x0, fmt
LLD_FLT x1, s0, flt LLD_FLT x1, s0, flt
#if defined(__APPLE__)
fcvt d0, s0 fcvt d0, s0
fmov x1, d0 #if defined(__APPLE__)
PUSH_R x1 PUSH_R d0
CRT printf CRT printf
add sp, sp, 16 add sp, sp, 16
#else #else
fcvt d0, s0
CRT printf CRT printf
#endif #endif
POP_P x29, x30 POP_P x29, x30

View file

@ -2,17 +2,17 @@
This book is written to the Linux calling convention as stated early on. This book is written to the Linux calling convention as stated early on.
Unfortunately, this means that even if you own an Apple Silicon machine, Unfortunately, this means that even if you own an Apple Silicon machine,
which is AARCH64, you'd still need a Linux virtual machine. This didn't which is AARCH64, you'd still need a Linux virtual machine.
sit well with some on reddit and rightfully so. We undertook to
develop a way of writing assembly code once and having it work on both
Mac OS and Linux to the degree possible.
We are pleased to present this chapter along with a set of assembly This didn't sit well with some on reddit and rightfully so. We undertook
language macros that, if used, help a great deal. to develop a way of writing assembly code once and having it work on
both Mac OS and Linux to the degree possible.
[Much of this chapter has been replaced here](./../../macros/).
There are some things we cannot adapt, such as variadic functions (e.g. There are some things we cannot adapt, such as variadic functions (e.g.
`printf()`) but we explain how code can be written to be compatible with `printf()`) but we explain how code can be written to be compatible with
both environments at the expense of some duplicated code. both environments at the expense of some minor amount duplicated code.
## Assembly language macros ## Assembly language macros
@ -44,95 +44,10 @@ This gets expanded to:
add x0, x0, fmt@PAGEOFF add x0, x0, fmt@PAGEOFF
``` ```
## Loading the address of data ## Reminder - there is documentation for these macros
Assuming: The documentation for the macro suite has been moved
[here](./../../macros/).
```text
.data
fmt: .asciz "Hello!"
```
When we:
`ldr x0, =fmt`
we are hoping to put the address of the label `fmt` into `x0`. But how
would this be possible since we've seen that addresses are (often) six
bytes long and our instructions are always 4 bytes long? As we describe
elsewhere, the above `ldr` instance is actually turned into instructions
to load an address relative to the address of the current instruction.
As long as the data we want is relatively close to the `ldr`, this works
out to a difference in addresses that is small (and so, can be fit into
a 4 byte instruction).
Apple does not allow instructions of the form:
`ldr x0, =fmt`
Instead they take a more general approach of splitting addresses of data
into two parts:
1. The *page* on which the label lives - think of this as generating the
upper bits of the address.
2. The *offset* on the page where the label actually resides - think of
this as the lower bits of the address.
Hence:
```text
adrp x0, fmt@PAGE
add x0, x0, fmt@PAGEOFF
```
The first instruction puts the high bits of the label's address in `x0`.
Then, the second instruction literally adds the low bits of the label's
address into `x0` forming a complete address.
In this way, labels can be further away from the current instruction
than the Linux way.
Apple does something similar with global variables, perhaps defined in
C or C++ files. Instead of `PAGE` and `PAGEOFF` they use global
versions. The macro `GLD_ADDR` is used in this case rather than
`LLD_ADDR` which works with "locally" defined addresses.
## How does this help bridge Apple and Linux?
[Here](./apple-linux-convergence.S) is an assembly language file
containing the macros we're developing to bring Linux and Apple Silicon
assembly language closer together.
Notice it has:
```text
.macro LLD_ADDR xreg, label
adrp \xreg, \label@PAGE
add \xreg, \xreg, \label@PAGEOFF
.endm
```
but also:
```text
.macro LLD_ADDR xreg, label
ldr \xreg, =\label
.endm
```
Which of these are used is determined by whether or not you are
assembling on an Apple machine or a Linux machine using features
provided by the standard C pre-processor. I.e.:
```text
# if defined(__APPLE__)
// apple stuff
# else
// not apple stuff
# endif
```
## How to force the C pre-processor to run on assembly language ## How to force the C pre-processor to run on assembly language
@ -147,52 +62,13 @@ file ends in .S*
## Differences between Apple and Linux ## Differences between Apple and Linux
### Loading label addresses
This was described above. If you use `LLD_ADDR` the macros will adapt
for you.
### Function labels
Apple prepends an underscore, Linux does not. Instead of:
`bl printf`
do:
`CRT printf`
and the macro will adapt.
### main
Like other function labels, Apple wants `_main` while Linux wants
`main`.
Simply use:
`MAIN`
and the macro will adapt.
### Globals
Instead of writing:
`.global main`
use
`GLABEL main`
and the macros will adapt.
You can find documentation on the macros [here](../../macros/README.md).
## Variadic functions ## Variadic functions
Functions like `printf()` are variadic. This means the function can take *This is important! Understand this section in order to be able to use
any number of parameters. The first argument contains some information `printf()`.*
Functions like `printf()` are variadic. These are functions that can
take any number of parameters. The first argument contains information
that tells the function how many parameters were actually given. that tells the function how many parameters were actually given.
For example: For example:
@ -205,36 +81,35 @@ expected.
Apple and Linux handle variadic differently. Apple and Linux handle variadic differently.
Linux will use the scratch registers first up to `x7`. *Then* it will Linux will use the scratch registers first up to the integer or floating
use the stack. point register 7. *Then* it will use the stack.
Apple will put the first parameter in the zero register and then shifts Apple will put the first parameter in the zero register and then shifts
immediately to putting all other parameters onto the stack. immediately to putting all other parameters onto the stack.
We overcome this difference by detecting which environment we are We overcome this difference by detecting which environment we are
building in using `#if` after having first set up for the Linux version. building in using `#if` after having first set up for the Linux version.
By setting up for the Linux version, the Apple version involves just By setting up for the Linux version, the Apple version involves just
pushing registers onto the stack. pushing registers onto the stack.
Remember that to print a float or double, they must be copied to `x` Remember that `%f` **always** expects a double. This is hidden from you
registers. in C and C++ but is important in assembly language. Use `fcvt` to shift
from single precision to double.
An example: An example:
```text ```text
LLD_ADDR x0, fmt // loads the address of fmt LLD_ADDR x0, fmt
LLD_PTR x1, ptr // loads **ptr LLD_FLT x1, s0, flt
ldr x1, [x1] // turns **ptr into *ptr fcvt d0, s0
ldr x2, [x1] // dereferences *ptr to get value #if defined(__APPLE__)
# if defined(__APPLE__) PUSH_R d0
// if apple, push the second and third argument to stack CRT printf
PUSH_P x1, x2 add sp, sp, 16
CRT printf #else
add sp, sp, 16 CRT printf
# else #endif
// if not apple, the registers are already set up
CRT printf
# endif
``` ```
## Other differences ## Other differences

Binary file not shown.

View file

@ -1,77 +1,87 @@
// Macros to permit the "same" assembly language to build on ARM64 /* Macros to permit the "same" assembly language to build on ARM64
// Linux systems as well as Apple Silicon systems. Linux systems as well as Apple Silicon systems.
//
// Perry Kivolowitz
// A Gentle Introduction to Assembly Language
See the fuller documentation at:
https://github.com/pkivolowitz/asm_book/blob/main/macros/README.md
Perry Kivolowitz
A Gentle Introduction to Assembly Language
*/
.macro GLD_PTR xreg, label
#if defined(__APPLE__) #if defined(__APPLE__)
// Apple makes a distinction between loading something close by
// versus something global. Note the use of GOTPAGE rather then
// PAGE.
//
// Note: this macro dereferences the label getting what is at
// the label's address.
.macro GLD_PTR xreg, label // Dereference a global *
adrp \xreg, _\label@GOTPAGE adrp \xreg, _\label@GOTPAGE
ldr \xreg, [\xreg, _\label@GOTPAGEOFF] ldr \xreg, [\xreg, _\label@GOTPAGEOFF]
#else
ldr \xreg, =\label
ldr \xreg, [\xreg]
#endif
.endm .endm
.macro GLD_ADDR xreg, label // Get a global address .macro GLD_ADDR xreg, label // Get a global address
#if defined(__APPLE__)
adrp \xreg, _\label@GOTPAGE adrp \xreg, _\label@GOTPAGE
add \xreg, \xreg, _\label@GOTPAGEOFF add \xreg, \xreg, _\label@GOTPAGEOFF
.endm #else
// This macro loads the address of a nearby label.
.macro LLD_ADDR xreg, label // Load a local address
adrp \xreg, \label@PAGE
add \xreg, \xreg, \label@PAGEOFF
.endm
.macro GLABEL label
.global _\label
.endm
.macro MAIN
_main:
.endm
.macro CRT label
bl _\label
.endm
#else // LINUX
.macro GLABEL label
.global \label
.endm
.macro MAIN
main:
.endm
.macro CRT label
bl \label
.endm
// This macro treats label as a pointer and dereferences it.
// That is, it puts into the xreg what is found at the address
// of the label.
.macro GLD_PTR xreg, label // Dereference a global *
ldr \xreg, =\label ldr \xreg, =\label
ldr \xreg, [\xreg] #endif
.endm .endm
// This macro loads the address of a nearby label.
.macro LLD_ADDR xreg, label .macro LLD_ADDR xreg, label
#if defined(__APPLE__)
adrp \xreg, \label@PAGE
add \xreg, \xreg, \label@PAGEOFF
#else
ldr \xreg, =\label ldr \xreg, =\label
#endif
.endm .endm
.macro LLD_DBL xreg, dreg, label
#if defined(__APPLE__)
adrp \xreg, \label@PAGE
add \xreg, \xreg, \label@PAGEOFF
ldur \dreg, [\xreg]
// fmov \dreg, \xreg
#else
ldr \xreg, =\label
ldur \dreg, [\xreg]
#endif #endif
.endm
.macro LLD_FLT xreg, sreg, label
#if defined(__APPLE__)
adrp \xreg, \label@PAGE
add \xreg, \xreg, \label@PAGEOFF
ldur \sreg, [\xreg]
#else
ldr \xreg, =\label
ldur \sreg, [\xreg]
#endif
.endm
.macro GLABEL label
#if defined(__APPLE__)
.global _\label
#else
.global \label
#endif
.endm
.macro MAIN
#if defined(__APPLE__)
_main:
#else
main:
#endif
.endm
.macro CRT label
#if defined(__APPLE__)
bl _\label
#else
bl \label
#endif
.endm
.macro START_PROC // after starting label .macro START_PROC // after starting label
.cfi_startproc .cfi_startproc