to prevent deadlock on media unbind (which is called with
the interface wlock()'ed), the medias reader processes
that unbind was waiting for used to discard packets when
the interface could not be rlocked.
this has the unfortunate side effect that when we change
addresses on a interface that packets are getting lost.
this is problematic for the processing of ipv6 router
advertisements when multiple RA's are getting received
in quick succession.
this change removes that packet dropping behaviour and
instead changes the unbind process to avoid the deadlock
by wunlock()ing the interface temporarily while waiting
for the reader processes to finish. the interface media
is also changed to the mullmedium before unlocking (see
the comment).
comparing m with MACHP() is wrong as m is a constant on 386.
add procflushothers(), which flushes all processes except up
using common procflushmmu() routine.
procflushmmu() returns once all *OTHER* processors that had
matching processes running on them flushed ther tlb/mmu state.
the caller of procflush...() takes care of flushing "up" by
calling flushmmu() later.
if the current process matched, then that means m->flushmmu
would be set, and hzclock() would call flushmmu() again.
to avoid this, we now check up->newtlb in addition to m->flushmmu
in hzclock() before calling flushmmu().
we also maintain information on which process on what processor
to wait for locally, which helps making progress when multiple
procflushmmu()'s are running concurrently.
in addition, this makes the wait condition for procflushmmu()
more sophisticated, by validating if the processor still runs
the selected process and only if it matchatches, considers
the MACHP(nm)->flushmmu flag.
the change to support no-execute bits broke the original
raspberry pi1, as it uses backwards compatible page table
format.
to use the XN bit, subpage AP bits have to be disabled
using the XP bit in CP15 Control Register c1 Bit 23.
when a process does an exec syscall, procsetup() is called and
we have to disable the debug watchpoint registers. just clearing
p->dr is not enougth as we are not going thru a procsave() and
procrestore() cycle which would disable and reload the saved
debug registers.
instead of clearing debug registers in procfork(), we should
clear the saved debug registers before a process goes to die
(pexit() calls sched() with up->state = Moribund) as the Proc
structure can get reused for kernel processes (kproc) which
never call procfork() and would therefore have debug registers
loaded.
for better system diagnostics, we *ALWAYS* want to record the parent
pid of a user process, regardless of if the child will post a wait
record on exit or not.
for that, we reverse the roles of Proc.parent and Proc.parentpid so
Proc.parentpid will always be set on rfork() and the Proc.parent
pointer will point to the parent's Proc structure or is set to nil
when no wait record should be posted on exit (RFNOWAIT flag).
this means that we can get the pid of the original parent process
from /proc, regardless of the the child having rforked with the
RFNOWAIT flag. this improves the output of pstree(1) somewhat if
the parent is still alive. note that theres no guarantee that the
parent pid is still valid.
the conditions are unchanged:
a user process that will post wait record has:
up->kp == 0 && up->parent != nil && up->parent->pid == up->parentpid
the boot process is:
up->kp == 0 && up->parent == nil && up->parentpid == 0
and kproc's have:
up->kp != 0 && up->parent == nil && up->parentpid == 0
Ori_B reports that his controller gets stuck in the ahciencreset()
wait loop. so we implement a 1000 ms timeout. we replace delay()
with esleep() as we are always called from the ledkproc().
blink() could enter infinite delay loop as well, so instead of
waiting for the message to get transmitted we exit making it
non-blocking (we will get called again anyway).
the access to the controller map[] was wrong in ledkproc(). the
index is the controller number, not the drive number!
when making outgoing connections, the source ip was selected
by just iterating from the first to the last interface and
trying each local address until a route was found. the result
was kind of hard to predict as it depends on the interface
order.
this change replaces the algorithm with the route lookup algorithm
that we already have which takes more specific desination and
source prefixes into account. so the order of interfaces does
not matter anymore.
on 386, the kernel memory region is mapped using huge 4MB pages
(when supported by the cpu). so the uncached pte manipulation
does not work to map the cursor data with uncached attribute.
instead, we allocate a memory page using newpage() and map
it globally using vmap(), which maps it uncached.
after issuing CR_RESETEP command, we have to invalidate
the endpoints output context buffer so that the halted/error
status reflects the new state. not doing so resulted in
the halted state to be stuck and we continued issuing
endpoint reset commands when we where already recovered.
handle the devusb Ep.clrhalt flag from devusb that userspace
uses to force a endpoint reset on the next transaction.
permission checking had the "other" and "owner" bits swapped plus incoming
connections where always owned by "network" instead of the owner of
the listening connection. also, ipwstat() was not effective as the uid
strings where not parsed.
this fixes the permission checks for data/ctl/err file and makes incoming
connections inherit the owner from the listening connection.
we also allow ipwstat() to change ownership to the commonuser() or anyone
if we are eve.
we might have to add additional restrictions for none at a later point...
we want devip to get reattached after hostowner has been written. factotum
already handles this with a private authdial() routine that mounts devip
when it is not present. so we detach devmnt before starting factotum,
and attach once factotum finishes.
the locking in proctext() is wrong. we have to acquire Proc.seglock
when reading segments from Proc.seg[] as segments do not
have a private freelist and can therefore be reused for other
data structures.
once we have Proc.seglock acquired, check that the process pid
is still valid so we wont accidentally read some other processes
segments. (for both proctext() and procctlmemio()). this also
should give better error message to distinguish the case when
the process did segdetach() the segment in question before we
could acquire Proc.seglock.
declare private functions as static.
there was a small window between modifying mmutop and switching the
asid where the core could bring in the new entries under the old asid
into the tlb due to speculation / prefetching.
this change moves the entering of the page tables into mmutop after
setttbr() to prevent this scenario.
due to us switching to the resereved asid 0 on procsave()->putasid(),
the only asid that could have potentially been poisoned would be asid 0
which does not have any user mappings. so this did not show any noticable
effect.
pexit() and pprint() can get called outside of a syscall
(from procctl()) with a process that is in active note
handling and require floating point in the kernel on amd64
for aesni (devtls).
make exec() clear the per process error string
to avoid spurious errors and confusion.
the errstr() syscall used to always swap the
maximum buffer size with memmove(), which is
problematic as this gives access to the garbage
beyond the NUL byte. worse, newproc(), werrstr()
and rerrstr() only clear the first byte of the
input buffer. so random stack rubble could be
leaked across processes.
we change the errstr() syscall to not copy
beyond the NUL byte.
the manpage also documents that errstr() should
truncate on a utf8 boundary so we use utfecpy()
to ensure proper NUL termination.
the idea is to catch bugs and make kernel exploitation
harder by mapping the kernel text section readonly
and everything else no-execute.
l.s maps the KZERO address space using 2MB pages so
to get the 4K granularity for the text section we use
the new ptesplit() function to split that mapping up.
we need to set EFER no-execute enable bit early
in apbootstrap so secondary application processors
will understand the NX bit in our shared kernel page
tables. also APBOOTSTRAP needs to be kept executable.
rebootjump() needs to mark REBOOTADDR page executable.
the user should not be able to change the cache
attributes for a segment in segattach() as this
can cause the same memory to be mapped with
conflicting attributes in the cache.
SG_TEXT should always be mapped with SG_RONLY
attribute. so fix data2txt() to follow the rules.
fault() now has an additional pc argument that is
used to detect fault on a non-executable segment.
that is, we check on read fault if the segment
has the SG_NOEXEC attribute and the program counter
is within faulting page.