alexchandel got the kernel to crash with divide error
on qemu 2.1.2/macosx at this location. probably
caused by perfticks()/tsc being wrong or accounttime()
not having been called yet from timer interrupt yet for
some reason.
the syscall stubs (for amd64) currently have a unconditional
spill of the first (register) argument to the stack.
sysr1 (and _nsec) are exceptional in that they do not
take any arguments, so the stub is writing unconditionally
to ther first argument slot on the stack.
i could avoid emiting the spill in the syscall stubs for
sysr1 but that would also break truss which assumes fixed
instruction sequence from stub start to the syscall number.
i'm not going to complicate the syscall stubs just for
sysr1 (_nsec is not used in 9front), but just add a dummy
argument to sysr1 definition that can receive the bogus
argument spill.
devip can only handle Maskconv+1 conversations per
protocol depending on how many bits it uses in the
qid to encode the conversation number.
we check this when the protocol gets registered.
if we do not do this, the kernel will mysteriously
panic when the conversaion numbers collide which
took some time to debug.
the numbers from /dev/sysstat overflow on 32bit, so have
to do subtraction modulo 2^32 as we calculate with 64bit
integers.
thanks mischief for reporting this.
WHAT WHERE THEY *THINKING*??!?!
unlike seek, the (new) nsec syscall (not used in 9front libc)
returns the time value in register (from nix), so do the same
for compatibility.
memfillcolor() used to write longs in host byte
order which is wrong. have to always use little
endian.
to simplify, moved little endian conversion into
memsetl() and memsets() avoiding code duplication.
file->parent can be nil when the file has been previously removed.
removefile() deals with this, so skip the permission check in
that case and let removefile() error out.
some of us run auth servers from home that are used by multiple
servers on the internet. when the home authserver becomes unreachable,
services on the outside servers stop working. so we thought about
specifing a secondary auth servers for backup when the primary
server is not reachable.
this changes authdial() to consult multiple auth= entries in
the authdom= or dom= tuples, trying each one in order until
dial() succeeds.
from segattach(2):
Va and len specify the position of the segment in the
process's address space. Va is rounded down to the nearest
page boundary and va+len is rounded up. The system does not
permit segments to overlap. If va is zero, the system will
choose a suitable address.
just rounding up len isnt enougth. we have to round up va+len
instead of just len so that the span [va, va+len) is covered
even if va is not page aligned.
kenjis example:
print("%p\n",ap); // 206cb0
ap = segattach(0, "shared", ap, 1024);
print("%p\n",ap); // 206000
term% cat /proc/612768/segment
Stack defff000 dffff000 1
Text R 1000 6000 1
Data 6000 7000 1
Bss 7000 7000 1
Shared 206000 207000 1
term%
note that 0x206cb0 + 0x400 > 0x20700.
RFC2104 defines HMAC for keys bigger than the 64 byte block
size as follows:
Applications that use keys longer than B (64) bytes will
first hash the key using H (the hash function) and then
use the resultant L byte string as the actual key to HMAC.
this is a work in progress implementation of the ayiya (anything
in anything) protocol as used by sixxs.net. hiro tested it and it
worked for him, but progress has stalled as sixxs.net rejected my
request for an account and ignored my emails since.
after running ip/ipconfig -6, we are unable to ping our
own link-local address and the arp daemon sends out useless
neighbor solicitation requests to itself. this change
adds an arp entry for our ipv6 address. however, this
must not be done for tentative interface configuration.
allocate the Iplifc structure on the stack instead.
i assuming that it was allocated on heap in fear of
causing stack oveflow. on 386, this adds arround
88 bytes on the stack but it doesnt seem to cause
any trouble. (checked with poolcheck after ctl write)
we have to recheck the condition under tod lock, otherwise
another process can come in and updated tod.last and
tod.off and once we have the lock, we would make time
jump backwards.
unify the keyboard and mouse readers into one using the hid
report parser for both. remove the keyboard protocol handling,
as it is now handled by hid parser and all we get is a sequence
of keycodes in Hiddev.k[] which we diff for up/down and translate
to pc scancodes.
in backwards mode, the roles of the aan filters need to be
reversed. add "-n address" option to import to override the
announce address for the aan server part (default tcp!*!0).
mischief got babble error with his mobile phone as we used to
read at max 64 bytes for the data response phase. his device
has 512 byte packet size.
thans to mischief for the patience.
ipifcunbind() could error out from ipifcremlifc() and Medium.unbind()
*after* decrementing ifc->conv->inuse! move the decrement after
calling these functions.
make ipifcremlifc() never raise error but return error string.
the only places where it could error is when it calls into
medium functions like Medium.remroute() and Medium.remmulti().
Ignore these errors as they could happen when the ethernet driver
crashed (think imported ethernet device or usb ethernet
in userspace), so we will be able to unbind.
add waserror() handlers as neccesary to deal with errors from
Medium.addmulti(), Medium.areg() and arpenter() to properly
unlock the data structures.
the allow command now takes an optional uid argument for the user
to be granted temporary god status on the fileserver for maintenance.
this was kenji okomotos idea, so thanks :)
remove wstatallow and writeallow flags. instead, we have global:
int allowed;
that contains the uid of the currently allowed user id or -1
if permission checking is globally disabled for the fileserver.
when zero, normal permission checking takes place.
added int isallowed(File*) function that returns non-zero when the
context is the console, or the allowed user. this is also used internally
by iaccess(), so all the extra code of in the callers of iaccess()
is gone now.
dont conflate allowed user with noauth flag and auto-allow on ream.
the installer already knows about noauth and allow flags so theres no
problem with bootstraping.
when mountmux() completes a request for another process, enforce odering
of the loads and stores to the request prior to writing q->done = 1
so mntflushfree() sees q->done != 0 only when the request has actually
completed. otherwise, the q->done = 1 store could have been reordered
before the load from q->z, reading from already freed request and causing
spurious wakeups.
removing unused mntstats callback.
use nil for pointers instead of 0.
_sl reported crash:
stats 593: suicide: sys: trap: fault write addr=0xffffffff8258d1b0 pc=0x204cc7
; acid 593
/proc/593/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: lstk()
notejmp(ret=0x1,j=0x40ac90)+0x13 /sys/src/libc/amd64/notejmp.c:10
alarmed(a=0xffffffff8258d1b0,s=0x7ffffeffea58)+0x3f /sys/src/cmd/stats.c:718
notifier+0x3e /sys/src/libc/port/atnotify.c:15
acid:
note how a in alarmed is a kernel address!
the first Ureg* argument is passed to the note handler in the
RARG (BX) register, which was not loaded when returning to
userspace from syscall() thru forkret(). fix by returning thru
noteret() from syscall().
old iostats failed to work when builidng the kernel due to old bugs
that where already fixed in exportfs. instead of backporting the fixes,
reimplement iostats as a filter that sits between exportfs and the
process mount. from users perspective, theres no difference.
the result is much smaller and can handle everything that exportfs
can like /srv.
Xqdol() used to take quadratic time because of strcat(),
the code isnt really needed as list2str() aready does the
same thing in linear time without the strcat().
add estrdup() which uses emalloc() so allocation error are
catched.
move strdups() of name from callers into newvar().
avoid recursion of conclist(), and avoid copying of word
strings by providing Newword() function which doesnt copy
the word string.
the 6c compiler reserves R14 and R15 for extern register variables,
which is used by the kernel to hold the m and up pointers. until
now, the meaning of R14 and R15 was undefined for userspace and
extern register would not work as the kernel trashes R14 and R15
on syscalls. with this change, user extern registers R14 and R15
are zeroed on exec and otherwise preserved across syscalls. so
userspace *could* use them for per process variables like the
kernel does.
use Ureg.bp (RARG) for syscall number instead of Ureg.ax. this is
less confusing and mirrors the amd64 calling convention.
addpage() should not be called with the display locked as it
calls showpage1() which sleeps when there are too many
processes active.
the bug was triggered by plumbing to trigger the addpage().
dont kill the calling process when demand load fails if fixfault()
is called from devproc. this happens when you delete the binary
of a running process and try to debug the process accessing uncached
pages thru /proc/$pid/mem file.
fixes to procctlmemio():
- fix missed unlock as txt2data() can error
- make sure the segment isnt freed by taking a reference (under p->seglock)
- access the page with segment locked (see comment)
- get rid of the segment stealer lock
other stuff:
- move txt2data() and data2txt() to segment.c
- add procpagecount() function
- make return type mcounseg() to ulong
handle reads and writes with 9pqueue(2) so they can
be flushed and wont hang the filesystem. this also
lets us get rid of the timeouts.
ftdi is still full of braindamage that should be
rewritten, but i dont have a device to test.
instead of naming devices by ther dynamically assigned device address,
we hash device uniqueue fields from the device descriptor and produce
a 5 digit hex string that will identify the device across machines.
when there is a collision (less than 1% chance with 100 devices),
usbd will append the device address to the name to make it uniqueue
for this machine.
the hname is passed to drivers in the devid argument, which now has
the form addr:hname, where the colon and hname can be omited (for backwards
compatibility).
when the new behaviour isnt desired, nousbhname= environment variable
can be defined giving the old behaviour.
pipeline = 1 with a dovecot imap server causes FETCH and OK responses
get interleaved so some message bodies accidentally get merged together.
disabling it will make fetching mail over imap slower, but it works.
make the Page stucture less than half its original size by getting rid of
the Lock and the lru.
The Lock was required to coordinate the unchaining of pages that where
both cached and on the lru freelist.
now pages have a single next pointer that is used for palloc.head
freelist xor for page cache hash chains in Image.pghash[].
cached pages are not on the freelist anymore, but will be reclaimed
from images by the pager when the freelist runs out of pages.
each Image has its own 512 hash chains for cached page lookup. That is
2MB worth of pages and there should be no collisions for most text images.
page reclaiming can be done without holding palloc.lock as the Image is
the owner of the page hash chains protected by the Image's lock.
reclaiming Image structures can be done quickly by only reclaiming pages from
inactive images, that is images which are not currently in use by segments.
the Ref structure has no Lock anymore. Only a single long that is atomically
incremented or decremnted using cmpswap().
there are various other changes as a consequence code. and lots of pikeshedding,
sorry.
the code was correct. erealloc9p() terminates the process
on error, but the code was handling realloc() error explicitely
and responded the request with Enomem error.
the vmware svga video card emulated by qemu (qemu -vga vmware) complains and eventually causes a panic if the rectangles aren't clipped.
messages like the following can be observed from qemu before the kernel panics:
vmsvga_update_rect: update h was < 0 (-20000)
vmsvga_update_rect: update height too large y: 10000, h: 0
vmsvga_update_rect: update w was < 0 (-20000)
vmsvga_update_rect: update width too large x: 10000, w: 0
i could only reproduce this in qemu 2.0.50 on the master branch, when using the ui and had selected 'Zoom To Fit' from the View menu.
fix bug introduced by amd64 support:
forgot to update ring index i on receive. surprisingly
this was working until there where more than one packet
to process. sorry.
ilock the controller while processing rings. this should
be fixed and use kprocs instead.
we have to make sure the *swap address* doesnt go away,
after putting the swap address in the segment pte.
after we unlock the segment, the process could be
killed or fault which would cause the swap address to
be freed *before* we write the page to disk when it
pulls the page from the cache and putswap() swap pte.
keeping a reference to the page is no good. we have
to hold on the swap address. this also has the advantage
that we can now test if the swap address is still
referenced and can avoid writing to disk.
as with the Block refcount changes, _xinc() and _xdec() arent
used anymore, so remove them.
architecure can still define ainc()/adec() when it needs them.
change Proc.nlocks from Ref to int and just use normal increment and decrelemt
as done in erik quanstros 9atom.
It is not clear why we used atomic increment in the fist place as even if we
get preempted by interrupt and scheduled before we write back the incremented
value, it shouldnt be a problem and we'll just continue where we left off as
our process is the only one that can write to it.
Yoann Padioleau found that the Mach pointer Lock.m wasnt maintained
consistently for lock() vs canlock() and ilock(). Fixed.
Use uintptr instead of ulong for maxlockpc, maxilockpc and ilockpc debug variables.
webfs forks the namespace to isolate itself from its mount
point which has the side effect that it captures the mount
of previous instances of webfs mounted on /mnt/web.
explicitely unmount the mountpoint in our namespace copy
to drop the reference.
when there are multiple readers of /dev/usbevent, we have to
serialize the processing to make sure that only one driver
is opening the devices control endpoint at a time.
to do this, we assume the device is busy after reading the
event file until the next read or clunk on the same fid.
to mark a device busy, we set the dev->aux pointer to the
fid processing a event. And the Event structure takes a
reference to the device producing the event.
the problem arised from cdc ethernet and nusb/serial sharing
the same device class, and we need to run the particular driver
to figure out if the device can be used. doing this concurrently
fails because devusb allows only one open per endpoint.
the palloc.pages array takes arround 5% of the upages which
gives us:
16GB = ~0.8GB
32GB = ~1.6GB
64GB = ~3.2GB
we only have 2GB of address space above KZERO so this will not
work for long.
instead, pageinit() was altered to accept a preallocated memory
in palloc.pages. and preallocpages() in pc64/main.c allocates the
in upages memory, mapping it in the VMAP area (which has 512GB).
the drawback is that we cannot poke at Page structures now from
/proc/n/mem as the VMAP area is not accessible from it.
Without an explicit signal for a truncation, copy propagation will
sometimes propagate a 32-bit truncation and end up overwriting uses of
the original 64-bit value.
This was independently discovered and fixed in Go. See:
http://golang.org/issue/1315https://codereview.appspot.com/6002043/
Thanks Charles Forsyth for tips and advice.
as we do system reset and reboot only from boot processor cpu0 now,
theres no need for active.rebooting conditional variable.
mpshutdown() will unconditionally park application processors and
and cpu0 boots the new kernel or calls mpshutdown() causing system
reset.
in vmware, mpshutdown() used to hang in i8042reset() when not
called from the boot processor, so instead of reseting from first
cpu that acquires the shutdown lock, we park all application
processors and let the boot processor do the reset.
procwrite() did truncate the offset to 32bit ulong.
introduce off2addr() function that does the sign
extension hack and use it conststently for Qmem
reads and writes.
for the CMclose procctl, the fd number was not
bounds checked before indexing in the Fgrp.fd
array.
for the CMclosefiles, we looped fd from 0..maxfd-1,
but need to loop from 0..maxfd as maxfd is inclusive.
newns() (called by auth_chuid()) already prepares the
environment variables and puts us in a sane working
directory (as specified by the namespace file).
on amd64, the text segment is aligned and padded to
2MB, but segment granularity is 4K which can give
us page faults that are beyond the highest file
offset. this is perfectly valid, but was not handled
correctly in pio().
use two per process memory slots, one for the
pid and one for the fd instead of a global table
avoiding the case when the table gets full.
instead of calling pread() on the cached fd
(dangerous as it has side effects when the
fd was not closed), we check if the cached fd
is still good using fd2path() when called
the first time in this process.
theres big performance regression with this using
cwfs. cwfs calls time() to update atime on every
read/write which now causes walks on /dev.
reverting to the previous version for now. in the
long run, we'll use new _nsec() syscall but this
has to wait for a later release once new kernels
are established.