introducing the PFPU structue which allows the machine specific
code some flexibility on how to handle the FPU process state.
for example, in the pc and pc64 kernel, the FPsave structure is
arround 512 bytes. with avx512, it could grow up to 2K. instead
of embedding that into the Proc strucutre, it is more effective
to allocate it on first use of the fpu, as most processes do not
use simd or floating point in the first place. also, the FPsave
structure has special 16 byte alignment constraint, which further
favours dynamic allocation.
this gets rid of the memmoves in pc/pc64 kernels for the aligment.
there is also devproc, which is now checking if the fpsave area
is actually valid before reading it, avoiding debuggers to see
garbage data.
the Notsave structure is gone now, as it was not used on any
machine.
qemu puts multiboot data after the end of the kernel image, so
to be able to KADDR() that memory early, we extend the initial
identity mapping by 16K. right now we just got lucky with
the pc kernel as it rounds the map to 4MB pages.
the FPOFF macro that follows the FXSAVE/FSAVE instructions in l.s
used to execute WAIT instruction when the TS flag was not set. this
is wrong and causes pending exceptions to be raised from fpsave which
is called from provsave() which holds up->rlock making it deadlock
when matherror() tries to postnote() to itself.
so making FPOFF non-waiting (just set TS flag).
we handle pending exception when restoring the context.
as with the Block refcount changes, _xinc() and _xdec() arent
used anymore, so remove them.
architecure can still define ainc()/adec() when it needs them.
forkret() labels the instructions that can raise exceptions
so they can be handled in trap(). this can happen when
segment descriptors get invalidated.
we now always use the new FXSAVE format in FPsave structure and fpregs
file, converting back and forth in fpx87save() and fpx87restore().
document that fprestore() is a destructive operation now.
change fp register definition in libmach and adapt fpr() acid funciton.
avoid unneccesary copy of fpstate and fpsave in sysfork(). functions
including syscalls do not preserve the fp registers and copying fpstate
from the current process would mean we had to fpsave(&up->fpsave); first.
simply not doing it, new process starts in FPinit state.