To reproduce run the following on a terminal:
<snip>
cpu% leak -s `{pstree | grep termrc | sed 1q | awk '{print $1}'}
src(0x00209a82); // 12
src(0x0020b2a6); // 1
cpu% acid `{pstree | grep termrc | sed 1q | awk '{print $1}'}
/proc/358/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x0020b2a6)
/sys/src/cmd/rc/plan9.c:169
164 if(runq->argv->words == 0)
165 poplist();
166 else {
167 free(runq->cmdfile);
168 int f = open(runq->argv->words->word, 0);
>169 runq->cmdfile = strdup(runq->argv->words->word);
170 runq->lexline = 1;
171 runq->pc--;
172 popword();
173 if(f>=0) execcmds(openfd(f));
174 }
acid:
</snap>
Another `runq->cmdfile` leak is present here (captured on a cpu server):
<snip>
277 ├listen [tcp * /rc/bin/service <nil>]
321 │├listen [/net/tcp/2 tcp!*!80]
322 │├listen [/net/tcp/3 tcp!*!17019]
324 ││└rc [/net/tcp/5 tcp!185.64.155.70!3516]
334 ││ ├rc -li
382 ││ │└pstree
336 ││ └rc
338 ││ └cat
323 │└listen [/net/tcp/4 tcp!*!17020]
278 ├listen [tcp * /rc/bin/service.auth <nil>]
320 │└listen [/net/tcp/1 tcp!*!567]
381 └closeproc
cpu% leak -s 336
src(0x00209a82); // 2
src(0x002051d2); // 1
cpu% acid 336
/proc/336/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x002051d2)
/sys/src/cmd/rc/exec.c:1056
1051
1052 void
1053 Xsrcfile(void)
1054 {
1055 free(runq->cmdfile);
>1056 runq->cmdfile = strdup(runq->code[runq->pc++].s);
1057 }
acid:
</snap>
These leaks happen because we do not free cmdfile on all execution paths
where `Xreturn()` is invoked. In `/sys/src/cmd/rc/exec.c:/^Xreturn`
<snip>
void
Xreturn(void)
{
struct thread *p = runq;
turfredir();
while(p->argv) poplist();
codefree(p->code);
runq = p->ret;
free(p);
if(runq==0)
Exit(getstatus());
}
</snip>
Note how the function `Xreturn()` frees a heap allocated instance of type
`thread` with its members *except* the `cmdfile` member.
On some code paths where `Xreturn()` is called there is an attempt to free
`cmdfile`, however, there are some code paths where `Xreturn()` is called
where `cmdfile` is not freed, leading to a leak.
The attached patch calls `free(p->cmdfile)` in `Xreturn()` to avoid leaking
memory and handling the free in one place.
After applying the patch this particular leak is removed. There are still
other leaks in rc:
<snip>
277 ├listen [tcp * /rc/bin/service <nil>]
321 │├listen [/net/tcp/2 tcp!*!80]
322 │├listen [/net/tcp/3 tcp!*!17019]
324 ││└rc [/net/tcp/5 tcp!185.64.155.70!3516]
334 ││ ├rc -li
382 ││ │└pstree
336 ││ └rc
338 ││ └cat
323 │└listen [/net/tcp/4 tcp!*!17020]
278 ├listen [tcp * /rc/bin/service.auth <nil>]
320 │└listen [/net/tcp/1 tcp!*!567]
381 └closeproc
cpu% leak -s 336
src(0x00209a82); // 2
src(0x002051d2); // 1
cpu% acid 336
/proc/336/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x00209a82)
/sys/src/cmd/rc/subr.c:9
4 #include "fns.h"
5
6 void *
7 emalloc(long n)
8 {
>9 void *p = malloc(n);
10 if(p==0)
11 panic("Can't malloc %d bytes", n);
12 return p;
13 }
14
</snap>
To help fixing those leaks emalloc(…) and erealloc(…) have been amended to use
setmalloctag(…) and setrealloctag(…). The actual fixes for other reported leaks
are *not* part of this merge and will follow.
/*
* emmc2 has different DMA constraints based on SoC revisions. It was
* moved into its own bus, so as for RPi4's firmware to update them.
* The firmware will find whether the emmc2bus alias is defined, and if
* so, it'll edit the dma-ranges property below accordingly.
*/
emmc2bus: emmc2bus {
compatible = "simple-bus";
ranges = <0x0 0x7e000000 0x0 0xfe000000 0x01800000>;
dma-ranges = <0x0 0xc0000000 0x0 0x00000000 0x40000000>;
emmc2: mmc@7e340000 {
compatible = "brcm,bcm2711-emmc2";
reg = <0x0 0x7e340000 0x100>;
interrupts = <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clocks BCM2711_CLOCK_EMMC2>;
status = "disabled";
};
};
Some mmc controllers have no card detect pin, so the only
way to detect card presence is to issue the ACMD41 which will
fail after a pretty long timeout.
To avoid mmconline() blocking, we only try to initialize the
card synchronous once, and then retry in a background process,
while returning immediately from mmconline() while the retry
is in progress.
This speeds up network boot times significantly on a raspi
without a sdcard inserted.
If the font chosen for acme is retrieved via `getenv("font")` its
memory is leaked:
<snip>
if(fontnames[0] == nil)
fontnames[0] = getenv("font");
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> getenv(…) mallocs memory
if(fontnames[0] == nil)
fontnames[0] = "/lib/font/bit/vga/unicode.font";
if(access(fontnames[0], 0) < 0){
fprint(2, "acme: can't access %s: %r\n", fontnames[0]);
exits("font open");
}
if(fontnames[1] == nil)
fontnames[1] = fontnames[0];
fontnames[0] = estrdup(fontnames[0]);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> if the `getenv("font")` path was taken above, this assignment
> will leak its memory.
</snap>
The following leak/acid session demonstrates the issue:
<snip>
cpu% leak -s 212252
src(0x002000cb); // 1
cpu% acid 212252
/proc/212252/text:amd64 plan 9 executable
/sys/lib/acid/port
/sys/lib/acid/amd64
acid: src(0x002000cb)
/sys/src/cmd/acme/acme.c:107
102 fprint(2, "usage: acme [-aib] [-c ncol] [-f font] [-F fixedfont] [-l loadfile | file...]\n");
103 exits("usage");
104 }ARGEND
105
106 if(fontnames[0] == nil)
>107 fontnames[0] = getenv("font");
108 if(fontnames[0] == nil)
109 fontnames[0] = "/lib/font/bit/vga/unicode.font";
110 if(access(fontnames[0], 0) < 0){
111 fprint(2, "acme: can't access %s: %r\n", fontnames[0]);
112 exits("font open");
acid:
</snap>
The fix tries to first check if a font has been set via
command line options in which case the font string is
malloced via estrdup(…).
If no font has been selected on the command line getenv("font")
is used. If no getenv("font") var is found we malloc a default
font via estrdup(…).
<snip>
if(fontnames[0] != nil)
fontnames[0] = estrdup(fontnames[0]);
else
if((fontnames[0] = getenv("font")) == nil)
fontnames[0] = estrdup("/lib/font/bit/vga/unicode.font");
if(access(fontnames[0], 0) < 0){
fprint(2, "acme: can't access %s: %r\n", fontnames[0]);
exits("font open");
}
if(fontnames[1] == nil)
fontnames[1] = fontnames[0];
fontnames[1] = estrdup(fontnames[1]);
</snap>
This resolves the memory leak reported by leak(1).
git/revert requires a file name argument, but when none is given
it fails in a strange way:
% git/revert
usage: cleanname [-d pwd] name...
/bin/git/revert:15: null list in concatenation
txt and caa rr strings might contain binary control characters
such as newlines and double quotes which mess up the output
in ndb(6) format.
so handle them as binary blobs internally and escape special
characters as \DDD where D is a octal digit when printing.
txtrr() will unescape them when reading into internal
binary representation.
remove the undocumented nullrr ndb attribute parsing code.
g(1): sync filetypes list
the file types list in the 'g' manual was out of date.
this change synchronizes and sorts them.
it looks like the .B macro only accepts 6 args or less,
so observe that limit.
introduce our own RR* format %P for pretty
printing and call %R format internally,
then use it to print the rest of the line
after the tab, prefixed with the padded
output.
have todo multiple fmtprint() calls for idnname()
as the buffer is shared.
do not idnname() rp->os and rp->cpu, these are symbols.
always quote txt= records.
If the source string has a run of more than 256 runes without
a "." dot, we'd overflow the runebuffer in idn2utf().
The utf2idn() routine had a check in the while loop, but that
is actually wrong too, as it would insert a dot and restart
the loop in the middle of a domain component. Just error
out if a domain component is too long.
- allow for external command to be run to install a challenge using -e flag
- remove the challengedom argument, it is given by the subject in the csr
- fix some filedescriptor leaks in error paths
In a few places, we where using a fixed buffer of sizeof(Dir)+100
size for stat. This is not correct and fails if the name returned
in stat is long.
This results in being unable to seek to the end of file with a
long filename.
The kernel should do the same thing as dirfstat() from libc;
handling the conversion and buffer allocation and returning a
freeable Dir* pointer.
For this, a new dirchanstat() function was added.
The fstat syscall was not rewriting the name to the last path
element; fix it.
In addition, gracefully handle the mountfix case, reallocating
the buffer to accomidate the required stat length plus
size of the new name so dirsetname() does not fail.
this patch fixes bugs in tls extension handling:
1. if conn->serverName is an empty string, tlsClientExtensions
will generate a SNI with an empty hostname, which is forbidden
according to RFC 6066:
opaque HostName<1..2^16-1>;
check if conn->serverName has at least one char.
2. checkClientExtensions fail with clients that doesn't have
extensions, because it doesn't check if ext is nil. fix that
up.
3. rewrite checkClientExtensions. some parts of the code does
not check the length properly, and it could be simplified
heavily.
The standard states in section 19.5.93:
.... Notice that if this operation is performed
on an obeject reference such as one produced by
the Alias, Index, or RefOf statements, the obect
type of the base object is returned.
Do the debuglevel check before calling the print
function for _threaddebug, by making it a macro.
Do not waste cycles passing arguments.
Generalize the _threaddebug function into _threadprint()
and add a varargcheck pragma. This function can
also be used from _threadassert().
Fix missing arguments in one case, fix trailing
newlines in _threaddebug().
Make _threadgetproc()/_threadsetproc() a macro,
just dereferencing Proc**_threadprocp.
Simplify the mainjump, just call _threadsetproc()
directly without that mainp dance. Remove the
_schedinit() argument, it uses _threadgetproc() now.
Get rid of Mainarg struct, just have a global variable
for argc.
The timing loop is here for the case if the
controller doesnt produce an interrupt when
becoming broken. In normal case, we should
just get worken up from the interrupt.
In any case, 100 times a second polling is
not neccessary here, increase to 1 second.
The old strategy of wait and retry doesnt seem to
work very well as it keeps all the forking parents
stuck waiting in the kernel worsening the situation.
The idea with this change is to have rfork() return
error quickly; and without whining; as most callers
would just react with a sysfatal() which might be
better for surviving this.
It is a bit of a annoyance that kenc will try to expand
function like macros on any symbol with the same name
and then complain when it doesnt see the '(' in the
invocation.
test case below:
void
foo(int)
{
}
struct Bar
{
int baz; /* <- should not conflict */
};
void
main(void)
{
baz(123);
}
The current behaviour of the kernel to deadlock itself
instead of returning an error on fork.
This might change in the future, so prepare libthread
to handle this case.
For _schedfork(), we'r going to just retry forking
on every switch, while for _schedexec(), the exec
will fail and send ~0 down the pid channel.
The ipoput4() and ipoput6() functions can raise an error(),
which means before calling sndrst() or limbo() (from tcpiput()),
we have to get rid of our blist by calling freeblist(bp).
Makse sure to set the Block pointer to nil after freeing in
ipiput() to avoid accidents.
Fix wrong panic string in sndsynack, and make any sending
functions like sndrst(), sndsynack() and tcpsendka()
return the value of ipoput*(), so we can distinguish
"no route" error.
Add a Enoroute[] string constant.
Both htontcp4() and htontcp6() can never return nil,
as they will allocate new or resize the existing block.
Remove the misleading error handling code that assumes
that it can fail.
Unlock proto on error in limborexmit() which can
be raised from sndsynack() -> ipoput*() -> error().
Make sndsynack() pass a Routehint pointer to ipoput*()
as it already did the route lookup, so we dont have todo
it twice.
i'm not confident about mutating the route tree
pointers and have concurrent readers walking the
pointer chains.
given that most route lookups are bypassed now
for non-routing case and we are not building a
high performance router here, lets play it safe.
theres no structure in the lower 32 bits of an ipv6 address.
use the top bit to distinguish special stuff like multicast
and link-local addresses, and use the 16-bit subnet-id bits
for the rest.
Instead of having to do an arp hash table lookup for each
outgoing ip packet, forward the Routehint pointer to the
medium's bwrite() function and let it cache the arp entry
pointer.
This avoids route and arp hash table lookups for tcp, il
and connection oriented udp.
It also allows us to avoid multiple route and arp table
lookups for the retransmits once an arp/neighbour solicitation
response arrives.
The Mhead structures have two sources of references to them:
- from Pgrp.mnthash hash-table
- from a channels Chan.umh pointer as returned by namec() for a union directory
Unless one holds the Mhead.lock RWLock, the Mhead.mount chain
can be mutated by eigther cmount(), cunmount() or closepgrp().
Readers, skipping acquiering the lock where:
mountfix(): responsible for rewriting directory entries for
union directory reads; was walking the Mhead.mount chain to
detect if the passed channel itself appears in the mount list.
cmount(): had a check and copy when "new" chan was a union itself
and if the MCREATE flag is set and would copy the mount table.
All this needs to be done with Mhead read-locked while copying
the mount entries.
devproc(): in the handler for reading /proc/n/ns file.
namec(): while checking if the Chan->umh should be initialized.
In addition to this, cmount() is changed to do the mountfree()
of the original mount chain when MREPL is done after releasing
the locks.
Also, some cosmetic changes...
The IPv4 ARP cache used to indefinitely buffer packets in the Arpent hold list.
This is bad in case of a router, because it opens a 1 second
(retransmit time) window to leak all the to be forwarded packets.
This change makes the ipv4 arp code path similar to the IPv6 neighbour
solicitation path, using the retransmit process to time out old entries
(after 3 arp retransmits => 3 seconds).
A new function arpcontinue() has been added that unifies the point when
we schedule the (ipv6 sol retransmit) / (ipv4 arp timeout) and reduce
the hold queue to the last packet and unlock the cache.
As a bonus, we also now send a icmp host unreachable notification
for the dropped packets.
tlsbwrite() would call checkstate() before calling tlsrecwrite()
to make sure the channel is open. however, because checkstate()
only raises the error, the Block* passed wont be freed and
would result in a memory leak.
move the checkstate() call inside tlsrecwrite() to reuse the
error handling that frees the block on error.
in OpenBSD 6.9 and up, the kernel (bsd, bsd.mp) still has
the ostype symbols, but bsd.rd appears to have lost them,
even when decompressed.
so, as a result, we should use what we have, which isn't
much.
Due to the way LCA is defined, a using a strict LCA
on a graph like this:
<--a--b--c--d--e--f--g
\ /
+-----h-------
can lead to spurious requests to merge. This happens
because 'lca(b, g)' would return 'a', since it can be
reached in one step from 'b', and 2 steps from 'g', while
reaching 'b' from 'a' would be a longer path.
As a result, we need to implement an lca variant that
returns the starting node if one is reachable from the
other, even if it's already found the technically correct
least common ancestor.
This replaces our LCA algorithm with one based on the
painting we do while finding a twixt, making it give
the resutls we want.
git/query: fix spurious merge requests
Due to the way LCA is defined, a using a strict LCA
on a graph like this:
<--a--b--c--d--e--f--g
\ /
+-----h-------
can lead to spurious requests to merge. This happens
because 'lca(b, g)' would return 'a', since it can be
reached in one step from 'b', and 2 steps from 'g', while
reaching 'b' from 'a' would be a longer path.
As a result, we need to implement an lca variant that
returns the starting node if one is reachable from the
other, even if it's already found the technically correct
least common ancestor.
This replaces our LCA algorithm with one based on the
painting we do while finding a twixt.
directory entries cannot span sector boundaries, meaning
that the end of a sector would be zero padded until the
next sector.
we have to skip over these zero paddings to fully read
the directory.
Plumber both posts a service to /srv and sets a $plumbsrv environment
variable. Our libplumb no longer uses $plumbsrv and nothing else
does. It's a silly hack; rc doesn't update /env immediately, and
scripts, which for instance set up subrios, cannot rely on it to
clean up the plumber at the end.
Instead, add the option to specify a srvname, actually check for some
common errors and print a usage string.
Thanks to Ori for input and a preliminary patch.
- enforce same behaviour as cachedb server in dblookup():
- force Taaaa record type on ipv6= attributes, regardless of value
- return Taaaa records for ip= attributes containing ipv6 values
- return Ta records only for ip= attributes containing ipv4 values
- for compatibility, bring back support for txtrr= type, but handle consistently
Git has the ability to track the person who
creates a commit separately from the person
who wrote the commit. For git9, we ignored
this feature.
However, as we start using git/import more,
it will be useful to figure out who imported
a commit, as well as who wrote it.
This change adds support for seeing this
information in git, as well as setting the
author and committer separately in git/import.
Target generation is revised, split into $YTARG and $TARG.
$PROGS is inlined to the cmd target.
%.cpus is added to allow chaining: mk all.cpus
$POWERLESS is added, since dtracy doesn't build yet on ppc.
$DIRS regexp is simplified, simplifing $NOMK.
$cpuobjtype is replaced with cp's recipe for copying itself on $cputype.
$APEDIRS is removed.
The none target is renamed to usage, since it prints out usage.
The ape target is removed.
The dirs target is replaced by all.dirs
%.directories is replaced by %.dirs
The all target serializes directories after cmds to match the install target recipe.
All regexp rules are replaced with nonregexp versions for clarity.
The &:n: rule is removed. Just build the $O.$cmd file.
.y files now build .c files, not .tab.c files, and remove (bc|units|mpc|pc).c:R:
All safeinstall rules are removed.
The cleanfiles rule is renamed to cleancmds and simplified.
%.clean is removed. Just use mk cleancmds.
The install rule serializes cp and yacc before building anything else, avoiding races.
The installall recipe is simplified with the install.cpus prereq.
%.installall is removed. Just use mk $cmd.install.cpus
The $O.cj, %.update, and compilers rules are removed.
openssh now disables RSA/SHA-1 by default, so using RSA/SHA-1 will
eventually cause us problems:
https://undeadly.org/cgi?action=article;sid=20210830113413
in addition, github will disable RSA/SHA-1 for recently added RSA keys:
https://github.blog/2021-09-01-improving-git-protocol-security-github/
this patch modifies ssh.c to use RSA/SHA-256 (aka rsa-sha2-256)
instead of RSA/SHA-1 (aka ssh-rsa) as the public key algorithm.
NOTE: public rsa keys and thumbprints are ***NOT AFFECTED***
by this patch.
while we're here, remove the workaround for github.com. it seems
that github has fixed their implementation, and does not look into
macalgs when we're using an aead cipher.
---
remove old /sys/src/games/nes/joynes in favor of joy(1).
joy(1) has more buttons for the other emulators; there is
no longer a significance in the order of the keys.
document nusb/joy, add information in each emulator manpage.
> This patch enables use of the igfx controller rather than vesa on the
> eeepc1005ha netbook. This means using the full screen resolution of
> 1024x600.
> *Andrew Eggenberger*
Per the docs:
the sender SHOULD include a LF, but the
receiver MUST NOT complain if it is not
present.
I typoed away the SHOULD, and got missed the
MUST NOT.
thanks qbit.
the subst utility no longer supports a '-g'
flag, but this was left behind in commit;
this means that the lines listing modified
files were not correctly commented in the
commit header.
This is mostly harmless, but when using an
editor like sam to edit the commit message,
the modified lines would have to be removed
manually.
was testing out the git/import tweaks and accidentally
pushed this commit. No comment on whether we want it,
but it definitely wasn't ready for merge.
Oops.
Often, people (including myself) will write emails that
can almost be applied with git/import. This changes
git/diff and git/import so that things will generally
work even when assembling diffs by hand:
1. git/import becomes slightly more lax:
^diff ...
^--- ...
will both be detected as the start of a patch.
2. git/diff produces the same format of diff
as git/export, starting with paths:
--- a/path/to/file
+++ b/path/to/file
which means that the 'ape/patch -p1' used
within git/import will just work.
So with this, if you send an email to the mailing list,
write up a committable description, and append the
output of git/diff to the end of the email, git/import
should just work.
[this patch was send through the mailing list using the
above procedure, and will be committed with git/import
to verify that it works as advertised]
2021-08-14 17:50 GMT, kemal <kemalinanc8@gmail.com>:
> 1- as driver reads 8 bytes from nvm instead of 6 so fw doesn't
> spit us an ADVANCED_SYSASSERT, it was reading 2 more
> extra bytes. apparently those 2 extra bytes were put to
> the first 2 bytes of our buffer, so we got to skip that.
some more thoughts on this, i think as 0x15*2 is not multiple
of 8, fw rounds the offset to 0x14*2. i have touched to code
to read data from 0x14*2 then ignore the first 2 bytes, just
so it's not confusing. if this causes mac to be read wrong again,
report.
also, some more changes:
1. set the fwname at iwlpci, just to align the behavior with 8000+.
this is a cosmetic change.
2. i have discovered that on device boot/reset/shutdown functions,
our driver slept way much more than it should. the reason for that is,
driver used the function delay() on places where it needs to use
microdelay() instead. i have modified the code to use microdelay().
wpi likely needs similar changes too. i hope that this does not
break the code.
3. zzz a bit more on tx/rx scheduler shutdowns and niclock.
4. openbsd's iwm and linux apparently does not check if ownership
was obtained anymore in their handover functions. instead they
just loop until the hw is ready. aligned the behavior.
see linux commit: 289e5501c3141191dd830957f1d764d3dc14a54f
5. don't take antenna masks from nvm. it's apparently empty
in some cards from 7k family. we will rely on what the fw file gives
us.
6. when the calibration is completed, wakeup the proc that runs
postboot. otherwise that thing sleeps for like 2 whole seconds
even if calibration completed earlier.
i honestly don't think any of these changes will fix 7260 not
being able to get calibration results, but i don't see anything
wrong at all in postboot7000 at this point. i will just hope
these changes somehow make it get calibration results.
NOTE: latest patch on the 9front ml, posted Mon, 14 Feb 2022 15:26:55 +0300
(non functional as of yet)