Use an RWlock so readers can work in parallel in
the common case (no cache updates).
When a reader needs to update the cache to add
a new learned source mac address, it will drop
the rlock and aquire the wlock to do the update.
When we get a read error, we now unbind the
port to avoid further packets being forwarded
to it.
This is usefull for hotplug ethernet devices
like usb ones or tunnels.
Simplify the unbind, getting rid of the refcount,
by having only the reader proc call freeport().
Avoid holding the bridge lock while opening
and closing ethernet/tunnel device files during
bind and unbind.
Dont use smalloc() (especially when holding locks).
Allocate bridges dynamically, so we do not waste
the memory when we do not need them.
Reject non-hostowner from allocating new bridges.
Use consistent naming: port -> port
Use consistent comment style: // -> /* */
The altsetting was handled only for a single endpoint
(per interface number), but has to be handled for each
endpoint (per interface *AND* altsetting number).
A multi function device (like a disk) can have
multiple interfaces, all with the same interface number
but varying altsetting numbers and each of these
interfaces would list distict endpoint configurations.
Multiple interfaces can even share some endpoints (they
use the same endpoint addresses), but
we still have to duplicate them for each
interface+altsetting number (as they'r part of
actually distict interfaces with distict endpoint
configurations).
It is also important to *NOT* make endpoints bi-directional
(dir == Eboth) when only one direction is used in a
interface/altsetting and the other direction in another.
This was the case for nusb/disk with some seagate drive
where endpoints where shared between the UAS and
usb storage class interface (but with distict altsettings).
The duplicate endpoints (as in using the same endpoint address)
are chained together by a next pointer and the head
is stored in Usbdev.ep[addr], where addr is the endpoint
address. These Ep structures will have distinct endpoint
numbers Ep.id (when they have conflicting types), but all
will share the endpoint address (lower 4 bits of the
endpoint number).
The consequence is that all of the endpoints configuration
(attributes, interval) is now stored in the Ep struct and
no more Altc struct is present.
A pointer to the Ep struct has to be passed to openep()
for it to configure the endpoint.
For the Iface struct, we will now create multiple of them:
one for each interface *AND* altsetting nunber,
chained together on a next pointer and the head being
stored in conf->iface[ifaceid].
--
cinap
Wlock()'ing the ifc causes a deadlock with Medium
bind/unbind as the routine can walk /net, while
ndb/dns or ndb/cs are currently blocked enumerating
/net/ipifc/*.
The fix is to have a fake medium, called "unbound",
that is set temporarily during the call of Medium
bind and unbind.
That way, the interface rwlock can be released while
bind/unbind is in progress.
The ipifcunbind() routine will refuse to unbind a
ifc that is currently assigned to the "unbound"
medium, preventing any accidents.
Pattern matching with lists no longer works:
; ls /tmp/*.c
/tmp/npage.c
/tmp/pagedebug.c
/tmp/pageold.c
/tmp/scheduler.c
/tmp/writeimagetest.c
; ls /tmp/^(*.c)
ls: /tmp/*.c: '/tmp/*.c' directory entry not found
; 9fs dump
; bind /n/dump/2021/1002/amd64/bin/rc /bin/rc
; rc
; ls /tmp/^(*.c)
/tmp/npage.c
/tmp/pagedebug.c
/tmp/pageold.c
/tmp/scheduler.c
/tmp/writeimagetest.c
the fix:
we have to propagate the glob attribute thru lists
as well. before it was only handled for single words
and propagated thru concatenations...
the Xglob instruction now works on list, and we
propagate the glob attribute thru PAREN and WORDS
and ARGLIST nodes.
also, avoid using negative numbers for the Tree.glob
field as char might be unsigned on some targets.
SSL is implemented by devssl. It's extremely
obsolete by now, and is not used anywhere but
cpu, import, and oexportfs.
This change strips out the devssl bits, but
does not (yet) remove the code from libsec.
If we don’t explicitly check for ‘h’ in troff, we can’t reliably check
for non-htmlroff well.
Consider the following:
.if h \{\
. de M
. tm m
..\}
Without this change, this will print m and not define macro M.
the pack cache was very stupid: it would close packs
as early as possible, which would prevent packs from
getting reused effectively. It would also select a
bad pack to close.
This picks the oldest pack, refcounts correctly, and
keeps up to Npackcache open at once (though it will
go over if more are in use).
This makes vmap()/vunmap() take a vlong size argument,
and change the type of Pci.mem[].size to vlong as well.
Even if vmap() wont support large mappings, it is nice to
get the original unruncated value for error checking.
pc64 needs a bigger VMAP window, as system76 pangolin
puts the framebuffer at a physical address > 512GB.
To reproduce run the following on a terminal:
<snip>
cpu% leak -s `{pstree | grep termrc | sed 1q | awk '{print $1}'}
src(0x00209a82); // 12
src(0x0020b2a6); // 1
cpu% acid `{pstree | grep termrc | sed 1q | awk '{print $1}'}
/proc/358/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x0020b2a6)
/sys/src/cmd/rc/plan9.c:169
164 if(runq->argv->words == 0)
165 poplist();
166 else {
167 free(runq->cmdfile);
168 int f = open(runq->argv->words->word, 0);
>169 runq->cmdfile = strdup(runq->argv->words->word);
170 runq->lexline = 1;
171 runq->pc--;
172 popword();
173 if(f>=0) execcmds(openfd(f));
174 }
acid:
</snap>
Another `runq->cmdfile` leak is present here (captured on a cpu server):
<snip>
277 ├listen [tcp * /rc/bin/service <nil>]
321 │├listen [/net/tcp/2 tcp!*!80]
322 │├listen [/net/tcp/3 tcp!*!17019]
324 ││└rc [/net/tcp/5 tcp!185.64.155.70!3516]
334 ││ ├rc -li
382 ││ │└pstree
336 ││ └rc
338 ││ └cat
323 │└listen [/net/tcp/4 tcp!*!17020]
278 ├listen [tcp * /rc/bin/service.auth <nil>]
320 │└listen [/net/tcp/1 tcp!*!567]
381 └closeproc
cpu% leak -s 336
src(0x00209a82); // 2
src(0x002051d2); // 1
cpu% acid 336
/proc/336/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x002051d2)
/sys/src/cmd/rc/exec.c:1056
1051
1052 void
1053 Xsrcfile(void)
1054 {
1055 free(runq->cmdfile);
>1056 runq->cmdfile = strdup(runq->code[runq->pc++].s);
1057 }
acid:
</snap>
These leaks happen because we do not free cmdfile on all execution paths
where `Xreturn()` is invoked. In `/sys/src/cmd/rc/exec.c:/^Xreturn`
<snip>
void
Xreturn(void)
{
struct thread *p = runq;
turfredir();
while(p->argv) poplist();
codefree(p->code);
runq = p->ret;
free(p);
if(runq==0)
Exit(getstatus());
}
</snip>
Note how the function `Xreturn()` frees a heap allocated instance of type
`thread` with its members *except* the `cmdfile` member.
On some code paths where `Xreturn()` is called there is an attempt to free
`cmdfile`, however, there are some code paths where `Xreturn()` is called
where `cmdfile` is not freed, leading to a leak.
The attached patch calls `free(p->cmdfile)` in `Xreturn()` to avoid leaking
memory and handling the free in one place.
After applying the patch this particular leak is removed. There are still
other leaks in rc:
<snip>
277 ├listen [tcp * /rc/bin/service <nil>]
321 │├listen [/net/tcp/2 tcp!*!80]
322 │├listen [/net/tcp/3 tcp!*!17019]
324 ││└rc [/net/tcp/5 tcp!185.64.155.70!3516]
334 ││ ├rc -li
382 ││ │└pstree
336 ││ └rc
338 ││ └cat
323 │└listen [/net/tcp/4 tcp!*!17020]
278 ├listen [tcp * /rc/bin/service.auth <nil>]
320 │└listen [/net/tcp/1 tcp!*!567]
381 └closeproc
cpu% leak -s 336
src(0x00209a82); // 2
src(0x002051d2); // 1
cpu% acid 336
/proc/336/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x00209a82)
/sys/src/cmd/rc/subr.c:9
4 #include "fns.h"
5
6 void *
7 emalloc(long n)
8 {
>9 void *p = malloc(n);
10 if(p==0)
11 panic("Can't malloc %d bytes", n);
12 return p;
13 }
14
</snap>
To help fixing those leaks emalloc(…) and erealloc(…) have been amended to use
setmalloctag(…) and setrealloctag(…). The actual fixes for other reported leaks
are *not* part of this merge and will follow.
/*
* emmc2 has different DMA constraints based on SoC revisions. It was
* moved into its own bus, so as for RPi4's firmware to update them.
* The firmware will find whether the emmc2bus alias is defined, and if
* so, it'll edit the dma-ranges property below accordingly.
*/
emmc2bus: emmc2bus {
compatible = "simple-bus";
ranges = <0x0 0x7e000000 0x0 0xfe000000 0x01800000>;
dma-ranges = <0x0 0xc0000000 0x0 0x00000000 0x40000000>;
emmc2: mmc@7e340000 {
compatible = "brcm,bcm2711-emmc2";
reg = <0x0 0x7e340000 0x100>;
interrupts = <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clocks BCM2711_CLOCK_EMMC2>;
status = "disabled";
};
};
Some mmc controllers have no card detect pin, so the only
way to detect card presence is to issue the ACMD41 which will
fail after a pretty long timeout.
To avoid mmconline() blocking, we only try to initialize the
card synchronous once, and then retry in a background process,
while returning immediately from mmconline() while the retry
is in progress.
This speeds up network boot times significantly on a raspi
without a sdcard inserted.
If the font chosen for acme is retrieved via `getenv("font")` its
memory is leaked:
<snip>
if(fontnames[0] == nil)
fontnames[0] = getenv("font");
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> getenv(…) mallocs memory
if(fontnames[0] == nil)
fontnames[0] = "/lib/font/bit/vga/unicode.font";
if(access(fontnames[0], 0) < 0){
fprint(2, "acme: can't access %s: %r\n", fontnames[0]);
exits("font open");
}
if(fontnames[1] == nil)
fontnames[1] = fontnames[0];
fontnames[0] = estrdup(fontnames[0]);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> if the `getenv("font")` path was taken above, this assignment
> will leak its memory.
</snap>
The following leak/acid session demonstrates the issue:
<snip>
cpu% leak -s 212252
src(0x002000cb); // 1
cpu% acid 212252
/proc/212252/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x002000cb)
/sys/src/cmd/acme/acme.c:107
102 fprint(2, "usage: acme [-aib] [-c ncol] [-f font] [-F fixedfont] [-l loadfile | file...]\n");
103 exits("usage");
104 }ARGEND
105
106 if(fontnames[0] == nil)
>107 fontnames[0] = getenv("font");
108 if(fontnames[0] == nil)
109 fontnames[0] = "/lib/font/bit/vga/unicode.font";
110 if(access(fontnames[0], 0) < 0){
111 fprint(2, "acme: can't access %s: %r\n", fontnames[0]);
112 exits("font open");
acid:
</snap>
The fix tries to first check if a font has been set via
command line options in which case the font string is
malloced via estrdup(…).
If no font has been selected on the command line getenv("font")
is used. If no getenv("font") var is found we malloc a default
font via estrdup(…).
<snip>
if(fontnames[0] != nil)
fontnames[0] = estrdup(fontnames[0]);
else
if((fontnames[0] = getenv("font")) == nil)
fontnames[0] = estrdup("/lib/font/bit/vga/unicode.font");
if(access(fontnames[0], 0) < 0){
fprint(2, "acme: can't access %s: %r\n", fontnames[0]);
exits("font open");
}
if(fontnames[1] == nil)
fontnames[1] = fontnames[0];
fontnames[1] = estrdup(fontnames[1]);
</snap>
This resolves the memory leak reported by leak(1).
git/revert requires a file name argument, but when none is given
it fails in a strange way:
% git/revert
usage: cleanname [-d pwd] name...
/bin/git/revert:15: null list in concatenation
txt and caa rr strings might contain binary control characters
such as newlines and double quotes which mess up the output
in ndb(6) format.
so handle them as binary blobs internally and escape special
characters as \DDD where D is a octal digit when printing.
txtrr() will unescape them when reading into internal
binary representation.
remove the undocumented nullrr ndb attribute parsing code.
g(1): sync filetypes list
the file types list in the 'g' manual was out of date.
this change synchronizes and sorts them.
it looks like the .B macro only accepts 6 args or less,
so observe that limit.
introduce our own RR* format %P for pretty
printing and call %R format internally,
then use it to print the rest of the line
after the tab, prefixed with the padded
output.
have todo multiple fmtprint() calls for idnname()
as the buffer is shared.
do not idnname() rp->os and rp->cpu, these are symbols.
always quote txt= records.
If the source string has a run of more than 256 runes without
a "." dot, we'd overflow the runebuffer in idn2utf().
The utf2idn() routine had a check in the while loop, but that
is actually wrong too, as it would insert a dot and restart
the loop in the middle of a domain component. Just error
out if a domain component is too long.
- allow for external command to be run to install a challenge using -e flag
- remove the challengedom argument, it is given by the subject in the csr
- fix some filedescriptor leaks in error paths
In a few places, we where using a fixed buffer of sizeof(Dir)+100
size for stat. This is not correct and fails if the name returned
in stat is long.
This results in being unable to seek to the end of file with a
long filename.
The kernel should do the same thing as dirfstat() from libc;
handling the conversion and buffer allocation and returning a
freeable Dir* pointer.
For this, a new dirchanstat() function was added.
The fstat syscall was not rewriting the name to the last path
element; fix it.
In addition, gracefully handle the mountfix case, reallocating
the buffer to accomidate the required stat length plus
size of the new name so dirsetname() does not fail.
this patch fixes bugs in tls extension handling:
1. if conn->serverName is an empty string, tlsClientExtensions
will generate a SNI with an empty hostname, which is forbidden
according to RFC 6066:
opaque HostName<1..2^16-1>;
check if conn->serverName has at least one char.
2. checkClientExtensions fail with clients that doesn't have
extensions, because it doesn't check if ext is nil. fix that
up.
3. rewrite checkClientExtensions. some parts of the code does
not check the length properly, and it could be simplified
heavily.
The standard states in section 19.5.93:
.... Notice that if this operation is performed
on an obeject reference such as one produced by
the Alias, Index, or RefOf statements, the obect
type of the base object is returned.
Do the debuglevel check before calling the print
function for _threaddebug, by making it a macro.
Do not waste cycles passing arguments.
Generalize the _threaddebug function into _threadprint()
and add a varargcheck pragma. This function can
also be used from _threadassert().
Fix missing arguments in one case, fix trailing
newlines in _threaddebug().
Make _threadgetproc()/_threadsetproc() a macro,
just dereferencing Proc**_threadprocp.
Simplify the mainjump, just call _threadsetproc()
directly without that mainp dance. Remove the
_schedinit() argument, it uses _threadgetproc() now.
Get rid of Mainarg struct, just have a global variable
for argc.
The timing loop is here for the case if the
controller doesnt produce an interrupt when
becoming broken. In normal case, we should
just get worken up from the interrupt.
In any case, 100 times a second polling is
not neccessary here, increase to 1 second.
The old strategy of wait and retry doesnt seem to
work very well as it keeps all the forking parents
stuck waiting in the kernel worsening the situation.
The idea with this change is to have rfork() return
error quickly; and without whining; as most callers
would just react with a sysfatal() which might be
better for surviving this.
It is a bit of a annoyance that kenc will try to expand
function like macros on any symbol with the same name
and then complain when it doesnt see the '(' in the
invocation.
test case below:
void
foo(int)
{
}
struct Bar
{
int baz; /* <- should not conflict */
};
void
main(void)
{
baz(123);
}
The current behaviour of the kernel to deadlock itself
instead of returning an error on fork.
This might change in the future, so prepare libthread
to handle this case.
For _schedfork(), we'r going to just retry forking
on every switch, while for _schedexec(), the exec
will fail and send ~0 down the pid channel.
The ipoput4() and ipoput6() functions can raise an error(),
which means before calling sndrst() or limbo() (from tcpiput()),
we have to get rid of our blist by calling freeblist(bp).
Makse sure to set the Block pointer to nil after freeing in
ipiput() to avoid accidents.
Fix wrong panic string in sndsynack, and make any sending
functions like sndrst(), sndsynack() and tcpsendka()
return the value of ipoput*(), so we can distinguish
"no route" error.
Add a Enoroute[] string constant.
Both htontcp4() and htontcp6() can never return nil,
as they will allocate new or resize the existing block.
Remove the misleading error handling code that assumes
that it can fail.
Unlock proto on error in limborexmit() which can
be raised from sndsynack() -> ipoput*() -> error().
Make sndsynack() pass a Routehint pointer to ipoput*()
as it already did the route lookup, so we dont have todo
it twice.
i'm not confident about mutating the route tree
pointers and have concurrent readers walking the
pointer chains.
given that most route lookups are bypassed now
for non-routing case and we are not building a
high performance router here, lets play it safe.
theres no structure in the lower 32 bits of an ipv6 address.
use the top bit to distinguish special stuff like multicast
and link-local addresses, and use the 16-bit subnet-id bits
for the rest.
Instead of having to do an arp hash table lookup for each
outgoing ip packet, forward the Routehint pointer to the
medium's bwrite() function and let it cache the arp entry
pointer.
This avoids route and arp hash table lookups for tcp, il
and connection oriented udp.
It also allows us to avoid multiple route and arp table
lookups for the retransmits once an arp/neighbour solicitation
response arrives.
The Mhead structures have two sources of references to them:
- from Pgrp.mnthash hash-table
- from a channels Chan.umh pointer as returned by namec() for a union directory
Unless one holds the Mhead.lock RWLock, the Mhead.mount chain
can be mutated by eigther cmount(), cunmount() or closepgrp().
Readers, skipping acquiering the lock where:
mountfix(): responsible for rewriting directory entries for
union directory reads; was walking the Mhead.mount chain to
detect if the passed channel itself appears in the mount list.
cmount(): had a check and copy when "new" chan was a union itself
and if the MCREATE flag is set and would copy the mount table.
All this needs to be done with Mhead read-locked while copying
the mount entries.
devproc(): in the handler for reading /proc/n/ns file.
namec(): while checking if the Chan->umh should be initialized.
In addition to this, cmount() is changed to do the mountfree()
of the original mount chain when MREPL is done after releasing
the locks.
Also, some cosmetic changes...
The IPv4 ARP cache used to indefinitely buffer packets in the Arpent hold list.
This is bad in case of a router, because it opens a 1 second
(retransmit time) window to leak all the to be forwarded packets.
This change makes the ipv4 arp code path similar to the IPv6 neighbour
solicitation path, using the retransmit process to time out old entries
(after 3 arp retransmits => 3 seconds).
A new function arpcontinue() has been added that unifies the point when
we schedule the (ipv6 sol retransmit) / (ipv4 arp timeout) and reduce
the hold queue to the last packet and unlock the cache.
As a bonus, we also now send a icmp host unreachable notification
for the dropped packets.
tlsbwrite() would call checkstate() before calling tlsrecwrite()
to make sure the channel is open. however, because checkstate()
only raises the error, the Block* passed wont be freed and
would result in a memory leak.
move the checkstate() call inside tlsrecwrite() to reuse the
error handling that frees the block on error.