Commit graph

44 commits

Author SHA1 Message Date
cinap_lenrek 0f97eb3a60 kernel: add secalloc() and secfree() functions for secret memory allocation
The kernel needs to keep cryptographic keys and cipher states
confidential. secalloc() allocates memory from the secret pool
which is protected from debuggers reading the memory thru devproc.
secfree() releases the memory, overriding the data with garbage.
2016-08-27 20:33:03 +02:00
cinap_lenrek 1057a859b8 devsegment: cleanups
- return distinct error message when attempting to create Globalseg with physseg name
- copy directory name to up->genbuf so it stays valid after we unlock(&glogalseglock)
- cleanup wstat() handling, allow changing uid
- make sure global segment size is below SEGMAXSIZE
- move isoverlap() check from globalsegattach() into segattach()
- remove Proc* argument from globalsegattach(), segattach() and isoverlap()
- make Physseg.attr and segattach attr parameter an int for consistency
2016-03-30 22:49:13 +02:00
cinap_lenrek 04c3a6f66e zynq: introduce SG_FAULT to prevent access to AXI segment while PL is not ready
access to the axi segment hangs the machine when the fpga
is not programmed yet. to prevent access, we introduce a
new SG_FAULT flag, that when set on the Segment.type or
Physseg.attr, causes the fault handler to immidiately
return with an error (as if the segment would not be mapped).

during programming, we temporarily set the SG_FAULT flag
on the axi physseg, flush all processes tlb's that have
the segment mapped and when programming is done, we clear
the flag again.
2016-03-27 20:57:01 +02:00
cinap_lenrek 595501b005 kernel: make fversion()/mntversion() types consistent 2016-03-10 03:02:28 +01:00
cinap_lenrek d19144155e kernel: missing changes for ibrk() prototype 2015-12-21 04:49:29 +01:00
cinap_lenrek 7f3659e78f kernel: cleanup exit()/shutdown()/reboot() code
introduce cpushutdown() function that does the common
operation of initiating shutdown, returning once all
cpu's got the message and are about to shutdown. this
avoids duplicated code which isnt really machine specific.

automatic reboot on panic only when *debug= is not set
and the machine is a cpu server or has no display,
otherwise just hang.
2015-11-30 14:56:00 +01:00
cinap_lenrek 9f4eac5292 kernel: pgrpcpy(), simplify Mount structure
instead of ordering the source mount list, order the new destination
list which has the advantage that we do not need to wlock the source
namespace, so copying can be done in parallel and we do not need the
copy forward pointer in the Mount structure.

the Mhead back pointer in the Mount strcture was unused, removed.
2015-08-09 21:16:10 +02:00
cinap_lenrek 86eb8ea6bb kernel: change vmemchr() length argument to ulong and simplify 2015-08-06 10:15:07 +02:00
cinap_lenrek 4bd9ed80c3 kernel: export mntattach() from devmnt.c avoiding bogus struct passing and special case in namec()
we already export mntauth() and mntversion(), so why not stop
being sneaky and just export mntattach() so bindmount() and
devshr can just call it directly with proper arguments being
checked.

we can also avoid handling #M attach specially in namec()
by having the devmnt's attach function do error(Enoattach).
2015-07-28 09:52:21 +02:00
cinap_lenrek 6617c63a37 kernel: pipelined read ahead for the mount cache
this changes devmnt adding mntrahread() function and some helpers
for it to do pipelined sequential read ahead for the mount cache.

basically, cread() calls mntrahread() with Mntrah structure and it
figures out if we where reading sequentially and if thats the case
issues reads of c->iounit size in advance.

the read ahead state (Mntrah) is kept in the mount cache so we can
handle (read ahead) cache invalidation in the presence of writes.
2015-07-26 05:43:26 +02:00
cinap_lenrek 8ed25f24b7 kernel: various cleanups of imagereclaim(), pagereclaim(), freepages(), putimage()
imagereclaim(), pagereclaim():
- move imagereclaim() and pagereclaim() declarations to portfns.h
- consistently use ulong type for page counts
- name number of pages to free "pages" instead of "min"
- check for pages == 0 on entry

freepages():
- move pagechaindone() call to wakeup newpage() consumers inside
  palloc critical section.

putimage():
- use long type for refcount
2015-07-09 00:01:50 +02:00
cinap_lenrek 64ed3658d2 kernel: add pagechaindone() to wakeup processes waiting for memory
we keep the details about palloc in page.c, providing pagechaindone()
for mmu code to be called after a series of pagechainhead() calls.
2015-06-15 17:40:47 +02:00
cinap_lenrek 8a3b388ffe kernel: implement separate wait queues for page allocation
give kernel processes and local disk file servers (procs
having noswap flag set) a clear advantage for page allocation
under starved condition by giving them ther own wait queue so
they get readied as soon as pages become available.
2015-06-15 16:05:00 +02:00
cinap_lenrek 46070c3122 kernel: add segio() function for reading/writing segments
devproc's procctlmemio() did not handle physical segment
types correctly, as it assumed it can just kmap() the page
in question and write to it. physical segments however
need to be mapped uncached but kmap() will always map
cached as it assumes normal memory. on some machines with
aliasing memory with different cache attributes
leads to undefined behaviour!

we borrow the code from devsegment and provide a generic
segio() function to read and write user segments which
handles all the cases without using kmap by just spawning
a kproc that attaches the segment that needs to be read
from or written to. fault() will setup the right mmu
attributes for us. it will also properly flush pages for
segments that maintain instruction cache when written.
however, tlb's have to be flushed separately.

segio() is used for devsegment and devproc now, which
also allows for simplification of fixfault() as there is no
special error handling case anymore as fixfault() is now
called from faulting process *only*.

reads from /proc/$pid/mem can now span multiple pages.
2015-04-16 00:45:25 +02:00
cinap_lenrek 972cd5e3fc kernel: get rid of auxpage() and preserve cache index bits in Page.va in mount cache
the mount cache uses Page.va to store cached range offset and
limit, but mips kernel uses cache index bits from Page.va to
maintain page coloring. Page.va was not initialized by auxpage().

this change removes auxpage() which was primarily used only
by the mount cache and use newpage() with cache file offset
page as va so we will get a page of the right color.

mount cache keeps the index bits intact by only using the top
and buttom PGSHIFT bits of Page.va for the range offset/limit.
2015-03-16 05:46:08 +01:00
cinap_lenrek fcc336b902 kernel: catch address overflow in syssegfree()
the "to" address can overflow in syssegfree() causing wrong
number of pages to be passed to mfreeseg(). with the current
implementation of mfreeseg() however, this doesnt cause any
data corruption but was just freeing an unexpected number of
pages.

this change checks for this condition in syssegfree() and
errors out instead. also mfreeseg() was changed to take
ulong argument for number of pages instead of int to keep
it consistent with other routines that work with page counts.
2015-03-07 18:59:06 +01:00
cinap_lenrek cb35d1a132 kernel: avoid inconsistent reads in /proc/#/fd and /proc/#/ns
to allow bytewise access to /proc/#/fd, the contents of the file where
recreated on each call. if fd's had been closed or reassigned between
the reads, the offset would be inconsistent and a read could start off
in the middle of a line. this happens when you cat /proc/#/fd file of
a busy process that mutates its filedescriptor table.

to fix this, we now return one line record at a time. if the line
fits in the read size, then this means the next read will always start
at the beginning of the next line record. we remember the consumed
byte count in Chan.mrock and the current record in Chan.nrock. (these
fields are free to usefor non-directory files)

if a read comes in and the offset is the same as c->mrock, we do not
need to regenerate the file and just render the next c->nrock's record.

for reads smaller than the line count, we have to regenerate the content
up to the offset and the race is still possible, but this should not
be the common case.

the same algorithm is now used for /proc/#/ns file, allowing a simpler
reimplementation and getting rid of Mntwalk state strcture.
2014-12-21 04:46:22 +01:00
cinap_lenrek b18a641397 kernel: remove implicit Proc* argument from procctl()
procctl() is always called with up and it would not
work correctly if passed a different process, so
remove the Proc* argument and use up directly.
2014-11-09 08:19:28 +01:00
cinap_lenrek 3b661a96ef kernel: make noswap flag exclude processes from killbig() if not eve, reset noswap flag on exec 2014-08-17 00:50:20 +02:00
cinap_lenrek 655ec332a7 devproc: fix proccrlmemio bugs
dont kill the calling process when demand load fails if fixfault()
is called from devproc. this happens when you delete the binary
of a running process and try to debug the process accessing uncached
pages thru /proc/$pid/mem file.

fixes to procctlmemio():

- fix missed unlock as txt2data() can error
- make sure the segment isnt freed by taking a reference (under p->seglock)
- access the page with segment locked (see comment)
- get rid of the segment stealer lock

other stuff:

- move txt2data() and data2txt() to segment.c
- add procpagecount() function
- make return type mcounseg() to ulong
2014-07-14 06:02:21 +02:00
cinap_lenrek d4d86df2ab kernel: new pagecache, remove Lock from page, use cmpswap for Ref instead of Lock
make the Page stucture less than half its original size by getting rid of
the Lock and the lru.

The Lock was required to coordinate the unchaining of pages that where
both cached and on the lru freelist.

now pages have a single next pointer that is used for palloc.head
freelist xor for page cache hash chains in Image.pghash[].

cached pages are not on the freelist anymore, but will be reclaimed
from images by the pager when the freelist runs out of pages.

each Image has its own 512 hash chains for cached page lookup. That is
2MB worth of pages and there should be no collisions for most text images.

page reclaiming can be done without holding palloc.lock as the Image is
the owner of the page hash chains protected by the Image's lock.

reclaiming Image structures can be done quickly by only reclaiming pages from
inactive images, that is images which are not currently in use by segments.

the Ref structure has no Lock anymore. Only a single long that is atomically
incremented or decremnted using cmpswap().

there are various other changes as a consequence code. and lots of pikeshedding,
sorry.
2014-06-22 15:12:45 +02:00
cinap_lenrek 72ba3571a3 kernel: remove _xinc()/_xdec()
as with the Block refcount changes, _xinc() and _xdec() arent
used anymore, so remove them.

architecure can still define ainc()/adec() when it needs them.
2014-06-08 01:35:22 +02:00
cinap_lenrek a2d96d47c9 kernel: always reset notepending in eqlock, handle forceclosefgrp in eqlocks 2014-04-29 21:17:07 +02:00
cinap_lenrek 316d8ad76b pc64: fix segattach
the comment about Physseg.size being in pages is wrong,
change type to uintptr and correct the comment.

change the length parameter of segattach() and isoverlap()
to uintptr as well. segments can grow over 4GB in pc64 now
and globalsegattach() in devsegment calculates len argument
of isoverlap() by s->top - s->bot. note that the syscall
still takes 32bit ulong argument for the length!

check for integer overflow in segattach(), make sure segment
goes not beyond USTKTOP.

change PTEMAPMEM constant to uvlong as it is used to calculate
SEGMAXSIZE.
2014-03-04 22:37:15 +01:00
cinap_lenrek 9405f4c95f kernel: getting rid of duppage() (thanks charles)
simplifying paging code by getting rid of duppage(). instead,
fixfault() now always makes a copy of the shared/cached page
and leaves the cache alone. newpage() uncaches pages as
neccesary.

thanks charles forsyth for the suggestion.

from http://9fans.net/archive/2014/03/26:

> It isn't needed at all. When a cached page is written, it's trying hard to
> replace the page in the cache by a new copy,
> to return the previously cached page. Instead, I copy the cached page and
> return the copy, which is what it already
> does in another instance. ...
2014-03-02 20:55:26 +01:00
cinap_lenrek 521a34d33b kernel: keep cached pages continuous at the end of the page list on imagereclaim()
imagereclaim() sabotaged itself by breaking the invariant
that cached pages are kept at the end of the page list.

once we made a hole of uncached pages, we would stop
reclaiming cached pages before it as the loop breaks
once it hits a uncached page. (we iterate backwards from
the tail to the head of the pagelist until pages have been
reclaimed or we hit a uncached page).

the solution is to move pages to the head of the pagelist
after removing them from the image cache.
2014-02-24 22:42:22 +01:00
cinap_lenrek 520957e254 kernel: fix ulong abuse in xalloc 2014-01-21 22:12:25 +01:00
cinap_lenrek ebfb4fdf29 kernel: convert putmmu() to uintptr for va and pa 2014-01-20 03:17:55 +01:00
cinap_lenrek 6c2e983d32 kernel: apply uintptr for ulong when a pointer is stored
this change is in preparation for amd64. the systab calling
convention was also changed to return uintptr (as segattach
returns a pointer) and the arguments are now passed as
va_list which handles amd64 arguments properly (all arguments
are passed in 64bit quantities on the stack, tho the upper
part will not be initialized when the element is smaller
than 8 bytes).

this is partial. xalloc needs to be converted in the future.
2014-01-20 00:47:55 +01:00
cinap_lenrek 7761128093 devproc: make sure /proc/n/wait waits for the right process children
theres a race when we wait for a process children and that
process exits before we sleep().
2013-12-07 07:17:32 +01:00
cinap_lenrek aa0627162b remove non standard COM3 (eia2) serial port from i8250 uart.
access to non standard serial port COM3 at i/o port 0x200 causes
kernel panic on some machines (Toshiba Sattelite 1415-S115). also,
some machines have gameport at 0x200.

i readded uartisa to the pcf and pccpuf kernel configurations so
one can use plan9.ini to add non standard uarts like:

uart2=type=isa port=0x200 irq=5
2013-01-13 10:23:31 +01:00
cinap_lenrek 4b4070a8b9 ratrace: fix race conditions and range check
the syscallno check in syscallfmt() was wrong. the unsigned
syscall number was cast to an signed integer. so negative
values would pass the check provoking bad memory access from
kernel. the check also has an off by one. one has to check
syscallno >= nsyscalls instead of syscallno > nsyscalls.

access to the p->syscalltrace string was not protected
from modification in devproc. you could awake the process
and cause it to free the string giving an opportunity for
the kernel to access bad memory. or someone could kill the
process (pexit would just free it).

now the string is protected by the usual p->debug qlock. we
also keep the string arround until it is overwritten again
or the process exists. this has the nice side effect that
one can inspect it after the process crashed.

another problem was that our validaddr() would error() instead
of pexiting the current process. the code was changed to only
access up->s.args after it was validated and copied instead of
accessing the user stack directly. this also prevents a sneaky
multithreaded process from chaning the arguments under us.

in case our validaddr() errors, we cannot assume valid user
stack after the waserror() if block. use up->s.arg[0] for the
noted() call to avoid bad access.
2012-11-23 20:27:09 +01:00
cinap_lenrek 77c21a062c kernel: remove duppage debug, add comments, cleanup 2012-02-16 18:04:08 +01:00
cinap_lenrek 2450b55c7b kernel: add pidalloc() and reuse pid once the counter wraps arround 2011-12-20 22:22:08 +01:00
cinap_lenrek a6e3c9fd83 calculate the real number of pages used by segments and use it for killbig and proc 2011-08-26 04:47:34 +02:00
cinap_lenrek c44b78f739 change definition of Chan.create to return a chan like open 2011-08-17 23:27:31 +02:00
cinap_lenrek 70e4b8d1f9 added eqlock(), a interruptable version of qlock. addresses issue #81 2011-08-10 16:21:17 +02:00
aiju 27fd88af23 devshr: rename hook 2011-07-28 14:22:39 +02:00
cinap_lenrek e7d3e20912 remove keyboard stuff from other ports, make openssl and python compile on arm 2011-05-21 00:42:08 +00:00
cinap_lenrek 4bc74b8aef audioif, mixer control 2011-05-20 18:30:46 +00:00
cinap_lenrek be81150bb4 remove audio.h, put stuff in port^(dat fns).h 2011-05-16 22:31:27 +00:00
cinap_lenrek 26c2845917 kbdfs 2011-05-09 08:32:14 +00:00
Taru Karttunen a9060cc06b Import sources from 2011-03-30 iso image - lib 2011-03-30 19:35:09 +03:00
Taru Karttunen e5888a1ffd Import sources from 2011-03-30 iso image 2011-03-30 15:46:40 +03:00