- remove arbitrary limits on screen size, just check with badrect()
- post resize when physgscreenr is changed (actualsize ctl command)
- preserve physgscreenr across softscreen flag toggle
- honor panning flag on resize
- fix nil dereference in panning ctl command when scr->gscreen == nil
- use clipr when drawing vga plan 9 console (vgascreenwin())
check for "needkey " error string from auth_userpasswd() in case no
key is pesent in factotum. this used to be a common trap with stand
alone machines that do not have an authentication server setup.
indicate authentication in progress by drawing a white border.
delete unneccesary cruft and simplify the code.
the role=login protocol is ment to replace proto=p9cr in
auth_userpasswd() from libauth to authenticate a user
given a username and a password. in contrast to p9cr, it
does not require an authentication server when user is the
hostowner and its key is present in factotum.
- unroll the loops
- rotate the taps on each step, avoiding copies
- simplify boolean formulas for Ch() and Maj()
this yields arround 40% throughput increase on 32/64bit
archs for sha2_256 and sha2_512 on amd64.
- get rid of the temporary copies and memmoves()
- when the data pointer is aligned, do xor and copying inline
speedup for auth/aescbc encryption depends on arch:
- zynq 7% (arm)
- t23 13% (386)
- x230 20% (amd64, aes-ni)
- apu2 25% (amd64, aes-ni)
to capture bios and bootloader messages, convert the contents
on the screen to kmesg.
on machines without legacy cga, the cga registers read out as
0xFF, resuting in out of bounds cgapos. so set cgapos to 0 in
that case.
from Ori_B:
There were a small number of changes needed from the tarball
on spinroot.org:
- The mkfile needed to be updated
- Memory.h needed to not be included
- It needed to invoke /bin/cpp instead of gcc -E
- It depended on `yychar`, which our yacc doesn't
provide.
I'm still figuring out how to use spin, but it seems to do
the right thing when testing a few of the examples:
% cd $home/src/Spin/Examples/
% spin -a peterson.pml
% pcc pan.c -D_POSIX_SOURCE
% ./6.out
(Spin Version 6.4.7 -- 19 August 2017)
+ Partial Order Reduction
Full statespace search for:
never claim - (none specified)
assertion violations +
acceptance cycles - (not selected)
invalid end states +
State-vector 32 byte, depth reached 24, errors: 0
40 states, stored
27 states, matched
67 transitions (= stored+matched)
0 atomic steps
hash conflicts: 0 (resolved)
Stats on memory usage (in Megabytes):
0.002 equivalent memory usage for states (stored*(State-vector + overhead))
0.292 actual memory usage for states
128.000 memory used for hash table (-w24)
0.534 memory used for DFS stack (-m10000)
128.730 total actual memory usage
unreached in proctype user
/tmp/Spin/Examples/peterson.pml:20, state 10, "-end-"
(1 of 10 states)
pan: elapsed time 1.25 seconds
pan: rate 32 states/second
doing 4 quarterround's in parallel using 128-bit
vector registers. for second round shuffle the columns and
then shuffle back.
code is rather obvious. only trick here is for the first
quaterround PSHUFLW/PSHUFHW is used to swap the halfwords
for the <<<16 rotation.
instead of hardcoding the tunnel interface MTU to 1280,
we calculate the tunnel MTU from the outside MTU, which
can now be specified with the -m mtu option. The deault
outside MTU is 1500 - 8 (PPPoE).
when a process does an exec, it calls procsetup() which
unconditionally sets the sets the TS flag and fpstate=FPinit
and fpurestore() should not revert the fpstate.
cannot just reenable the fpu in FPactive case as we might have
been procsaved() an rescheduled on another cpu. what was i thinking...
thanks qu7uux for reproducing the problem.
new approach to graphics memory management:
the kernel driver never really cared about the size of stolen memory
directly. that was only to figure out the maximum allocation
to place the hardware cursor image somewhere at the end of the
allocation done by bios.
qu7uux's gm965 bios however wont steal enougth memory for his
native resolution so we have todo it manually.
the userspace igfx driver will figure out how much the bios
allocated by looking at the gtt only. then extend the memory by
creating a "fixed" physical segment.
the kernel driver allocates the memory for the cursor image
from normal kernel memory, and just maps it into the gtt at the
end of the virtual kernel framebuffer aperture.
thanks to qu7uux for the patch.
Add assembler versions for aes_encrypt/aes_decrypt and the key
setup using AES-NI instruction set. This makes aes_encrypt and
aes_decrypt into function pointers which get initialized by
the first call to setupAESstate().
Note that the expanded round key words are *NOT* stored in big
endian order as with the portable implementation. For that reason
the AESstate.ekey and AESstate.dkey fields have been changed to
void* forcing an error when someone is accessing the roundkey
words. One offender was aesXCBmac, which doesnt appear to be
used and the code looks horrible so it has been deleted.
The AES-NI implementation is for amd64 only as it requires the
kernel to save/restore the FPU state across syscalls and
pagefaults.
The aim is to take advantage of SSE instructions such as AES-NI
in the kernel by lazily saving and restoring FPU state across
system calls and pagefaults. (everything can can do I/O)
This is accomplished by the functions fpusave() and fpurestore().
fpusave() remembers the current state and disables the FPU if it
was active by setting the TS flag. In case the FPU gets used,
the current state gets saved and a new PFPU.fpslot is allocated
by mathemu().
fpurestore() restores the previous FPU state, reenabling the FPU
if fpusave() disabled it.
In the most common case, when userspace is not using the FPU,
then fpusave()/fpurestore() just toggle the FPpush bit in
up->fpstate.
When the FPU was active, but we do not use the FPU, then nothing
needs to be saved or restored. We just switched the TS flag on
and off agaian.
Note, this is done for the amd64 kernel only.
introducing the PFPU structue which allows the machine specific
code some flexibility on how to handle the FPU process state.
for example, in the pc and pc64 kernel, the FPsave structure is
arround 512 bytes. with avx512, it could grow up to 2K. instead
of embedding that into the Proc strucutre, it is more effective
to allocate it on first use of the fpu, as most processes do not
use simd or floating point in the first place. also, the FPsave
structure has special 16 byte alignment constraint, which further
favours dynamic allocation.
this gets rid of the memmoves in pc/pc64 kernels for the aligment.
there is also devproc, which is now checking if the fpsave area
is actually valid before reading it, avoiding debuggers to see
garbage data.
the Notsave structure is gone now, as it was not used on any
machine.
adjust to new aes_xts routines.
allow optional offset in the 4th argument where the encrypted
sectors start instead of hardcoding the 64K header area for
cryptsetup.
avoid allocating temporary buffer for cryptio() reads, we can
just decrypt in place there.
use sdmalloc() to allocate the temporary buffer for cryptio()
writes so that devsd wont need to allocate and copy in case
it didnt like our alignment.
do not duplicate the error reporting code, just use io()
that is what it is for.
allow 2*256 bit keys in addition to 2*128 bit keys.
the previous implementation was not portable at all, assuming
little endian in gf_mulx() and that one can cast unaligned
pointers to ulong in xor128(). also the error code is likely
to be ignored, so better abort() when the length is not a
multiple of the AES block size.
we also pass in full AESstate structures now instead of
the expanded key longs, so that we do not need to hardcode
the number of rounds. this allows each indiviaul keys to
be bigger than 128 bit.