corrected layout of floats for AARCH64

Perry Kivolowitz 2024-02-15 13:14:59 -06:00
parent d099c79c1d
commit a7e89718b1
4 changed files with 7 additions and 9 deletions

@@ -16,7 +16,7 @@ be on one chip and RAM on another set of chips.
The idea of registers was introduced a very long time ago as being
super fast storage that is implemented directly in the CPU. Because they
are within the CPU, distance isn't really an issue. Similarly, because
are within the CPU, distance isn'tv really an issue. Similarly, because
they are in the CPU, they operate at the speed of the CPU itself.
Registers don't have addresses because they are not in memory. Instead
@@ -232,7 +232,7 @@ This is like:
*ptr = var;
```
The analogies are not exact but close.
**The analogies are not exact but close.**
Pairs of registers can also be stored and loaded with the `stp` and
`ldp` op codes.

Binary file not shown.

@@ -29,15 +29,13 @@ For example, in the following image, note the overlap of two single
precision floats within a single double precision floating point
register.
*NOTE NOTE NOTE* This must be fixed - the picture corresponds to the
32 bit state - AARCH32!
*NOTE NOTE NOTE* To keep to our promise of simplicity for now, consider
only `B0`, `H0`, `S0` and `D0`. The remainder of the image ([from The
Eclectic Light Company](https://eclecticlight.co/2021/08/23/code-in-arm-assembly-lanes-and-loads-in-neon/)) deals with SIMD, covered
later.
![regs](./regs.png)
![regs](./simdlanes.jpg)
It is worth noting early and often that you should not mix dealing
with different precisions assuming that because of the overlaps in
space, you'll get a meaningful result.
The above image does not show the corresponding layout of [half
precision](./half.md) floating point registers. `H0` sits in the least
significant bits of `S0` and so on.
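
The warning about mixing precisions can be sketched in C. This is only an illustration, assuming IEEE 754 with a 64-bit `double` and a 32-bit `float`. The helper names `low_half_as_float` and `overlap_demo` are hypothetical and do not come from this repository. The helper copies the low 32 bits of a `double` into a `float`, which mimics reading `S0` after a double was placed in `D0`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the low 32 bits of a double's bit
   pattern as a float. This mimics reading S0 after writing D0.
   Assumes IEEE 754, sizeof(double) == 8 and sizeof(float) == 4. */
float low_half_as_float(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);           /* raw bits of the double */
    uint32_t low = (uint32_t)(bits & 0xFFFFFFFFu);
    float f;
    memcpy(&f, &low, sizeof f);               /* same bits, viewed as a float */
    return f;
}

/* Usage sketch: the overlap in space does not preserve the value. */
void overlap_demo(void) {
    double d = 1.0;                           /* bit pattern 0x3FF0000000000000 */
    assert((float)d == 1.0f);                 /* a real conversion keeps the value */
    assert(low_half_as_float(d) == 0.0f);     /* the low half is all zero bits */
}
```

A real conversion (the cast here, or a convert instruction in assembly) changes the encoding to fit the new precision. Reading the narrower register only returns whichever bits happen to occupy that space.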

Binary file not shown.