dont kill the calling process when demand load fails if fixfault()
is called from devproc. this happens when you delete the binary
of a running process and try to debug the process accessing uncached
pages thru /proc/$pid/mem file.
fixes to procctlmemio():
- fix missed unlock as txt2data() can error
- make sure the segment isnt freed by taking a reference (under p->seglock)
- access the page with segment locked (see comment)
- get rid of the segment stealer lock
other stuff:
- move txt2data() and data2txt() to segment.c
- add procpagecount() function
- make return type mcounseg() to ulong
make the Page stucture less than half its original size by getting rid of
the Lock and the lru.
The Lock was required to coordinate the unchaining of pages that where
both cached and on the lru freelist.
now pages have a single next pointer that is used for palloc.head
freelist xor for page cache hash chains in Image.pghash[].
cached pages are not on the freelist anymore, but will be reclaimed
from images by the pager when the freelist runs out of pages.
each Image has its own 512 hash chains for cached page lookup. That is
2MB worth of pages and there should be no collisions for most text images.
page reclaiming can be done without holding palloc.lock as the Image is
the owner of the page hash chains protected by the Image's lock.
reclaiming Image structures can be done quickly by only reclaiming pages from
inactive images, that is images which are not currently in use by segments.
the Ref structure has no Lock anymore. Only a single long that is atomically
incremented or decremnted using cmpswap().
there are various other changes as a consequence code. and lots of pikeshedding,
sorry.
change Proc.nlocks from Ref to int and just use normal increment and decrelemt
as done in erik quanstros 9atom.
It is not clear why we used atomic increment in the fist place as even if we
get preempted by interrupt and scheduled before we write back the incremented
value, it shouldnt be a problem and we'll just continue where we left off as
our process is the only one that can write to it.
Yoann Padioleau found that the Mach pointer Lock.m wasnt maintained
consistently for lock() vs canlock() and ilock(). Fixed.
Use uintptr instead of ulong for maxlockpc, maxilockpc and ilockpc debug variables.
this change is in preparation for amd64. the systab calling
convention was also changed to return uintptr (as segattach
returns a pointer) and the arguments are now passed as
va_list which handles amd64 arguments properly (all arguments
are passed in 64bit quantities on the stack, tho the upper
part will not be initialized when the element is smaller
than 8 bytes).
this is partial. xalloc needs to be converted in the future.
make sure not to dereference Proc* nil pointer. this can potentially
happen from devip which has code like:
if(er->read4p)
postnote(er->read4p, 1, "unbind", 0);
the process it is about to kill can zero er->read4p at any time,
so there is the possibility of the condition to be true and then
er->read4p becoming nil.
check if the process has already exited (p->pid == 0) in postnote()
under p->debug qlock.
replaced the p->pid != 0 check with up->parentpid != 0 so
p->pid == up->parentpid is never true for p->pid == 0.
avoid allocating the wait records when up->parentpid == 0.
when a process got forked with RFNOWAIT, its p->parent will still
point to the parent process, but its p->parentpid == 0.
this causes the "parent still alive" check in pexit to get confused
as it only checked p->pid == up->parentpid. this condition is *TRUE*
in the case of RFNOWAIT when the parent process is actually dead
(p->pid == 0) so we attached the wait structure to the dead parent
leaking the memory.
the syscallno check in syscallfmt() was wrong. the unsigned
syscall number was cast to an signed integer. so negative
values would pass the check provoking bad memory access from
kernel. the check also has an off by one. one has to check
syscallno >= nsyscalls instead of syscallno > nsyscalls.
access to the p->syscalltrace string was not protected
from modification in devproc. you could awake the process
and cause it to free the string giving an opportunity for
the kernel to access bad memory. or someone could kill the
process (pexit would just free it).
now the string is protected by the usual p->debug qlock. we
also keep the string arround until it is overwritten again
or the process exists. this has the nice side effect that
one can inspect it after the process crashed.
another problem was that our validaddr() would error() instead
of pexiting the current process. the code was changed to only
access up->s.args after it was validated and copied instead of
accessing the user stack directly. this also prevents a sneaky
multithreaded process from chaning the arguments under us.
in case our validaddr() errors, we cannot assume valid user
stack after the waserror() if block. use up->s.arg[0] for the
noted() call to avoid bad access.
newproc() didnt zero parentpid and kproc() didnt set it, so
kprocs ended up with random parent pid. this is harmless as
kprocs have no up->parent but it gives confusing results in
pstree(1).
now we zero parentpid in newproc(), and set it in sysrfork()
unless RFNOWAIT has been set.
kstrcpy() did not null terminate for < 4 byte buffers. fixed,
but i dont think there is any case where this can happen in
practice.
always set malloctag in kstrdup(), cleanup.
always use ERRMAX bounded kstrcpy() to set up->errstr, q->err
and note[]->msg. paranoia.
instead of silently truncating interface name in netifinit(),
panic the kernel if interface name is too long as this case
is clearly a mistake.
panic kernel when filename is too long for addbootfile() in
devroot. this might happen if your kernel configuration is
messed up.
we have to acquire p->seglock before we lock the individual
segments of the process and lock them. if we dont then pexit()
might free the segments before we can lock them causing the
"qunlock called with qlock not held, from ..." prints.
always make sure that there are child processes we can wait for
before sleeping.
put pwait() sleep into a loop and recheck. this is not strictly
neccesary but prevents accidents if there are spurious wakeups
or a bug.