Now that we have these new functions,
we can also make them return an error
instead of calling sysfatal() like
postmountsrv().
Remove the confusing Srv.srvfd, as it
is only temporarily used and return
it from postsrv() instead.
To use srvrease()/srvaquire() we need to have a way to spawn
new processes to handle the service loop. This functionality
was provided by the internal _forker() function which was
eigther rfork or libthread based implementation depending on
if postmountsrv() or threadpostmountsrv() where called.
For servers who want to use srv() directly, _forker would not
be initialized so srvrelease() could not be used.
To untangle this, we get rid of the global _forker handler
and put the handler in the Srv structure. Which will get
initialized (when nil) to eigther srvforker() or threadsrvforker()
depending on if the thread or non-thread entry points where used.
For symmetry, we provde new threadsrv() and threadpostsrv()
functions which handle the default initialization of Srv.forker.
This also allows a user to provide his own forker function,
maybe to conserve stack space.
To avoid dead code, we put each of these function in their
own object file. Note, this also allows a user to define its
own srvforker() symbol.
mischief provided the following test that shows the issue:
ramfs -S crash
aux/9pcon /srv/crash <<EOF
Tversion 8192 9P2000
Tattach 0 -1 $user ''
Tcreate 0 dir 020000000777 0
Tattach 5 -1 $user ''
Twalk 5 6 dir
Tread 6 0 512
EOF
the problem is that lib9p wrongly allowed reads on closed fids,
due to the permission check only considering the lower 2 bits.
a closed fid has fid->omode == -1 and it would pass on read for:
(-1 & 3) == 3 == OEXEC
the following change explicitely checks for for the closed case
and also rejects writes on directories (they are rejected on
open/create, but a broken 9p client could still issue the request).
with the latest changes to shr(3), we can use ORCLOSE on
the control file to get the mount in the share automatically
removed when the server exits or something goes wrong during
postsharesrv().
do not expose postfd() and sharefd() functions. they where
undocumented and leak the control file descriptors.
it appears that too many fileservers rely on the fileserver
process sharing the filedescriptors with children of the
caller to postmntsrv() or threadpostmntsrv().
restoring previous behaviour for now.
it is unclear how Srv.nopipe flag should work inside
postmountserv(). if a server wants to serve on stdio
descriptors, he can just call srv() after initializing
Srv.infd and Srv.outfd.
The Srv.leavefdsopen hack can be removed now that acme
win has been fixed.
in case listensrv() is called with a previously active Srv,
we have to make sure that per connection state is zeroed
out (locks and reference conuts).
also, dont assume anything about the Ref structure. there
might be implementations that have a spinlock in them.
kivik wrote:
I've found a nasty bug in lib9p handling of Tversion
messages, where an invalid version string in the request
leads to servers abort()ing the spaceship.
To reproduce:
; ramfs -S ram
; aux/9pcon /srv/ram
Tversion ~0 DIE
The issue lies in sversion() where in case an invalid
version string is received we respond right away with
ofcall.version="unknown"; however, we fail to set the
ofcall.msize, which at this point is cleared to 0. This
causes the convS2M call in respond() to fail and abort being
called.
when we get eof, stop the loop immidiately and do not
rely on the read to eventually return an error.
when convM2S() fails to decode the message, error out
and stop the loop. there is no point in continuing.
listensrv() used to override Srv.end() with its own handler
to free the malloc'd Srv structure and close the fd. this
makes it impossible to register your own cleanup handler.
instead, we introduce the private Srv.free() handler that
is used by listensrv to register its cleanup code. Srv.free()
is called once all the srv procs have been exited and all
requests on that srv have been responded to while Srv.end()
is called once all the procs exited the srv loop regardless
of the requests still being in flight.
using "interrupt" ctl message directly doesnt work when the
process is doing libthread channel operations (threadrendezvous)
as it will just repeat a interrupted rendezvous(). threadint()
handles this for us.
file->parent can be nil when the file has been previously removed.
removefile() deals with this, so skip the permission check in
that case and let removefile() error out.
the code was correct. erealloc9p() terminates the process
on error, but the code was handling realloc() error explicitely
and responded the request with Enomem error.
use the Srv.end callback for freeing the srv and closing the
file descriptor of a connection. this makes sure we wont free
the srv while there are still outstanding requests that would
access the srv when doing the respond() call.
in multithreaded programs, we have to wait until all outstanding
requests have been responded before closing down the srv.
dont make write errors sysfatal(), only print them. in case if
listensrv() is used we dont want to exit the process in respond()
called by some worker thread.
make sure Tversion is only handled when there are no outstanding
requests and make sure message size is sane.