This code is checking the return of devwalk for
a walk resulting in a clone of an open pipe file. However,
devclone ensures that the chan we are cloning is not
currently open.
This is mostly a copy of port/usbxhci.c with PCIWADDR() replaced
by PADDR() and the pci specific code stripped out.
This could be refactored at a later time.
There is a gpio line for the main hub reset that needs to be
asserted and some power management functions that are currently
done by u-boot (using "usb start" command).
We will do these ourselves once we have the infrastructure for
it in place.
This is a work in progress port to the mntreform2 laptop.
Working so far:
- mmu (same as raspberry pi 3b+)
- arm generic timer
- gicv3
- uart1
- enet
With access to the uart, one can netboot this kernel in u-boot
using the following commands:
> dhcp
> bootm
devproc allows changing the noteid of another process,
which opens a race condition in sysrfork(): between deciding
to inherit the noteid of "up" to the child and calling
pidalloc() later to take the reference, the noteid could
have been changed and the child's noteid could have been
freed already in the process.
this bug can only happen when one writes the /proc/n/noteid
file of a process other than your own that is in the
process of forking.
the noteid changing functionality of devproc seems questionable
and seems to be only used by ape's setpgrid() implementation.
Avoid calling sdgetdev() for every I/O. Instead,
put the SDunit pointer for #S/sdXX/* files in Chan.aux
and keep a reference to SDev between sdopen()/sdclose().
This avoids having to do the sdindex() lookup and
qlock(),incref(),decref() on every read/write
operation. Removal of SDev's is quite rare and can
only happen with pcmcia ide controllers, and i assume
that in that case fileservers have exited properly
and closed their files before we attempt to remove
a device.
The rest is improving waserror() codepaths, making
sure we release the locks for any of the interface
callbacks (verify/online).
Also get rid of tas() and instead only change the
unit's rawopen flag while holding raw qlock.
pci uarts are detected late and usually do not contain
the console= parameter logic.
for these, we can just enable them when devuart is reset,
and replay the boot messages once enabled.
this is useful as it allows us to use these uarts for
kernel debugging in interrupt context.
This avoids ipconfig having to explicitly specify the tag
when we want to set the route type, as the tag can be provided
implicitly through the "tag" command.
This adds a new route "t"-flag that enables network address translation,
replacing the source address (and local port) of a forwarded packet with
one of the outgoing interface.
The state for a translation is kept in a new Translation structure,
which contains two Iphash entries, so it can be inserted into the
per protocol 4-tuple hash table, requiring no extra lookups.
Translations have a low overhead (~200 bytes on amd64),
so we can have many of them. They get reused after 5 minutes
of inactivity or when the per protocol limit of 1000 entries
is reached (then the one with longest inactivity is reused).
The protocol needs to export a "forward" function that is responsible
for modifying the forwarded packet, and then handle translations in
its input function for iphash hits with Iphash.trans != 0.
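The reuse policy for translations (5 minutes of inactivity, or the
longest-idle entry once the limit is reached) might look roughly like
the sketch below. This is an illustration only, not the kernel code:
the Trans structure, transalloc() and the constant names are
hypothetical, and time is passed in as plain seconds.

```c
#include <stddef.h>

enum {
	Maxtrans = 1000,	/* per protocol limit from the description */
	Idletime = 5*60,	/* seconds of inactivity before reuse */
};

typedef struct Trans Trans;
struct Trans {
	int	inuse;
	long	used;	/* time of last activity, in seconds */
};

/*
 * Pick a slot for a new translation among n entries (at most Maxtrans).
 * A free slot, or one idle for at least Idletime, is taken first.
 * If every entry is still active, the one with the longest
 * inactivity is reused.
 */
Trans*
transalloc(Trans *tab, int n, long now)
{
	Trans *t, *pick;
	int i;

	pick = NULL;
	for(i = 0; i < n; i++){
		t = &tab[i];
		if(!t->inuse || now - t->used >= Idletime){
			pick = t;
			break;
		}
		if(pick == NULL || t->used < pick->used)
			pick = t;
	}
	if(pick != NULL){
		pick->inuse = 1;
		pick->used = now;
	}
	return pick;
}
```

A linear scan is used here for brevity; the real table is hashed.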
This patch also fixes a few minor things found during development:
- Include the Iphash in the Conv structure, avoiding extra malloc
- Fix ttl exceeded check (ttl < 1 -> ttl <= 1)
- Router should not reply with ttl exceeded for multicast flows
- Extra checks for icmp advice to avoid protocol confusions.
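The ttl fix in the list above comes down to the fact that a router
decrements the ttl before forwarding, so a packet arriving with ttl 1
would leave with ttl 0. A minimal sketch of the corrected test
(ttlexceeded() is a hypothetical helper name, not taken from the patch):

```c
#include <stdint.h>

/*
 * Nonzero if a packet arriving with this ttl must not be forwarded.
 * The old test (ttl < 1) let ttl == 1 through, which would have
 * been sent out with ttl 0 after the decrement.
 */
int
ttlexceeded(uint8_t ttl)
{
	return ttl <= 1;
}
```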
Use an RWlock so readers can work in parallel in
the common case (no cache updates).
When a reader needs to update the cache to add
a new learned source mac address, it will drop
the rlock and acquire the wlock to do the update.
When we get a read error, we now unbind the
port to avoid further packets being forwarded
to it.
This is useful for hotplug ethernet devices
like usb ones or tunnels.
Simplify the unbind, getting rid of the refcount,
by having only the reader proc call freeport().
Avoid holding the bridge lock while opening
and closing ethernet/tunnel device files during
bind and unbind.
Don't use smalloc() (especially when holding locks).
Allocate bridges dynamically, so we do not waste
the memory when we do not need them.
Reject non-hostowner from allocating new bridges.
Use consistent naming: port -> port
Use consistent comment style: // -> /* */
Wlock()'ing the ifc causes a deadlock with Medium
bind/unbind as the routine can walk /net, while
ndb/dns or ndb/cs are currently blocked enumerating
/net/ipifc/*.
The fix is to have a fake medium, called "unbound",
that is set temporarily during the call of Medium
bind and unbind.
That way, the interface rwlock can be released while
bind/unbind is in progress.
The ipifcunbind() routine will refuse to unbind an
ifc that is currently assigned to the "unbound"
medium, preventing any accidents.
2021-08-14 17:50 GMT, kemal <kemalinanc8@gmail.com>:
> 1- as driver reads 8 bytes from nvm instead of 6 so fw doesn't
> spit us an ADVANCED_SYSASSERT, it was reading 2 more
> extra bytes. apparently those 2 extra bytes were put to
> the first 2 bytes of our buffer, so we got to skip that.
some more thoughts on this, i think as 0x15*2 is not a multiple
of 8, fw rounds the offset to 0x14*2. i have changed the code
to read data from 0x14*2 then ignore the first 2 bytes, just
so it's not confusing. if this causes mac to be read wrong again,
report.
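the offset arithmetic from the first paragraph, as a small sketch.
nvmalign() and macfromnvm() are made-up names for illustration, and the
8 byte alignment is the guess stated above, not something confirmed by
the firmware documentation.

```c
#include <stdint.h>
#include <string.h>

enum {
	Macoff = 0x15*2,	/* byte offset of the mac address in nvm */
	Maclen = 6,
	Align = 8,	/* assumed read granularity of the fw */
};

/* round a byte offset down to the assumed read granularity */
int
nvmalign(int off)
{
	return off & ~(Align-1);
}

/*
 * buf holds Align bytes read from nvmalign(Macoff); the mac address
 * starts after the bytes that precede Macoff in that read.
 */
void
macfromnvm(uint8_t mac[Maclen], const uint8_t *buf)
{
	int skip;

	skip = Macoff - nvmalign(Macoff);	/* 2 bytes */
	memcpy(mac, buf + skip, Maclen);
}
```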
also, some more changes:
1. set the fwname at iwlpci, just to align the behavior with 8000+.
this is a cosmetic change.
2. i have discovered that in the device boot/reset/shutdown functions,
our driver slept far longer than it should. the reason is that the
driver used delay() (milliseconds) in places where it needs to use
microdelay() (microseconds) instead. i have modified the code to use microdelay().
wpi likely needs similar changes too. i hope that this does not
break the code.
3. zzz a bit more on tx/rx scheduler shutdowns and niclock.
4. openbsd's iwm and linux apparently do not check if ownership
was obtained anymore in their handover functions. instead they
just loop until the hw is ready. aligned the behavior.
see linux commit: 289e5501c3141191dd830957f1d764d3dc14a54f
5. don't take antenna masks from nvm. it's apparently empty
in some cards from 7k family. we will rely on what the fw file gives
us.
6. when the calibration is completed, wakeup the proc that runs
postboot. otherwise that thing sleeps for like 2 whole seconds
even if calibration completed earlier.
i honestly don't think any of these changes will fix 7260 not
being able to get calibration results, but i don't see anything
wrong at all in postboot7000 at this point. i will just hope
these changes somehow make it get calibration results.
NOTE: latest patch on the 9front ml, posted Mon, 14 Feb 2022 15:26:55 +0300
(non functional as of yet)
Sometimes, there is the one-off occasion when one needs to
pass a huge list in rc...
This change makes devenv track total memory consumption
of environment groups allowing them to grow up to 1MB in
size (including overhead).
(Before, only the variable size was restricted, but
not the amount of files being created).
The maximum value size of a single environment variable
is set to half of the total size, which allows the
occasional large value. (But not many of them).
Because we track all memory consumption, it is also
now possible to create around 10k small environment
variables.
A hashtable is added for name lookups and the qid.path
was changed to allow direct indexing into the entry
array without needing a scan lookup.
All smalloc() calls have been removed, exhaustion is
handled with error(Enomem) avoiding deadlock
in case we run out of kernel memory.
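The accounting rule can be sketched as below. This is a simplified
illustration and not the devenv code: Egrp is reduced to a single
counter, envcharge() is a hypothetical helper, and it returns -1 where
the kernel would raise an error.

```c
enum {
	Maxenvsize = 1024*1024,	/* total per environment group, including overhead */
	Maxvalsize = Maxenvsize/2,	/* largest single value */
};

typedef struct Egrp Egrp;
struct Egrp {
	long	alloc;	/* bytes currently charged to the group */
};

/*
 * Charge the group for a value changing from olen to nlen bytes.
 * Returns 0 on success and -1 if a limit would be exceeded,
 * leaving the accounting unchanged in that case.
 */
int
envcharge(Egrp *eg, long olen, long nlen)
{
	long total;

	if(nlen > Maxvalsize)
		return -1;
	total = eg->alloc - olen + nlen;
	if(total > Maxenvsize)
		return -1;
	eg->alloc = total;
	return 0;
}
```

Two values of the maximum size fill the group completely, which is why
only the occasional large value fits.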
To avoid a MAXMACH limit of 32, make
txtflush into an array for the bitmap.
Provide portable macros for testing and clearing
the bits: needtxtflush(), donetxtflush().
On pc/pc64, define inittxtflush()/settxtflush()
as no-op macros, avoiding the storage overhead of
the txtflush array altogether.
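A sketch of what such a per-page bitmap and its macros could look like.
The MAXMACH value and the explicit processor number argument are
assumptions made to keep the example self-contained; the kernel macros
may take different arguments.

```c
#include <string.h>

enum { MAXMACH = 128 };	/* assumed value, for illustration only */

typedef struct Page Page;
struct Page {
	unsigned char	txtflush[(MAXMACH+7)/8];	/* one bit per processor */
};

/* mark the page as needing (c != 0) or not needing a flush on all processors */
#define settxtflush(p, c)	memset((p)->txtflush, (c) ? 0xFF : 0, sizeof((p)->txtflush))
#define inittxtflush(p)	settxtflush(p, 0)

/* test and clear the bit of processor number n */
#define needtxtflush(p, n)	((p)->txtflush[(n)/8] & (1<<((n)&7)))
#define donetxtflush(p, n)	((p)->txtflush[(n)/8] &= ~(1<<((n)&7)))
```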
SSL is implemented by devssl. It's extremely
obsolete by now, and is not used anywhere but
cpu, import, and oexportfs.
This change strips out the devssl bits, but
does not (yet) remove the code from libsec.
This makes vmap()/vunmap() take a vlong size argument,
and changes the type of Pci.mem[].size to vlong as well.
Even if vmap() won't support large mappings, it is nice to
get the original untruncated value for error checking.
pc64 needs a bigger VMAP window, as system76 pangolin
puts the framebuffer at a physical address > 512GB.
/*
* emmc2 has different DMA constraints based on SoC revisions. It was
* moved into its own bus, so as for RPi4's firmware to update them.
* The firmware will find whether the emmc2bus alias is defined, and if
* so, it'll edit the dma-ranges property below accordingly.
*/
emmc2bus: emmc2bus {
compatible = "simple-bus";
ranges = <0x0 0x7e000000 0x0 0xfe000000 0x01800000>;
dma-ranges = <0x0 0xc0000000 0x0 0x00000000 0x40000000>;
emmc2: mmc@7e340000 {
compatible = "brcm,bcm2711-emmc2";
reg = <0x0 0x7e340000 0x100>;
interrupts = <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clocks BCM2711_CLOCK_EMMC2>;
status = "disabled";
};
};
Some mmc controllers have no card detect pin, so the only
way to detect card presence is to issue the ACMD41 which will
fail after a pretty long timeout.
To avoid mmconline() blocking, we only try to initialize the
card synchronously once, and then retry in a background process,
returning immediately from mmconline() while the retry
is in progress.
This speeds up network boot times significantly on a raspi
without a sdcard inserted.
In a few places, we were using a fixed buffer of sizeof(Dir)+100
size for stat. This is not correct and fails if the name returned
in stat is long.
This results in being unable to seek to the end of file with a
long filename.
The kernel should do the same thing as dirfstat() from libc;
handling the conversion and buffer allocation and returning a
freeable Dir* pointer.
For this, a new dirchanstat() function was added.
The fstat syscall was not rewriting the name to the last path
element; fix it.
In addition, gracefully handle the mountfix case, reallocating
the buffer to accommodate the required stat length plus
size of the new name so dirsetname() does not fail.
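The allocate-and-retry idea, in the style of dirfstat(), can be
sketched in portable C as follows. fakestat() is a hypothetical stand-in
for the real stat call: it only models the convention that a too small
buffer gets just the 2-byte size field. statalloc() is likewise a
made-up name and is not the new dirchanstat() function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { BIT16SZ = 2 };

/*
 * Hypothetical stat: the record is a little-endian 16-bit size
 * followed by the name. If the buffer is too small, only the size
 * field is written and BIT16SZ is returned.
 */
static int
fakestat(const char *name, uint8_t *buf, int nbuf)
{
	int n;

	n = (int)strlen(name);
	if(nbuf < BIT16SZ)
		return -1;
	buf[0] = n & 0xFF;
	buf[1] = (n >> 8) & 0xFF;
	if(nbuf < BIT16SZ + n)
		return BIT16SZ;
	memcpy(buf + BIT16SZ, name, n);
	return BIT16SZ + n;
}

/* Return a malloc'ed stat record of *np bytes, or NULL on failure. */
uint8_t*
statalloc(const char *name, int *np)
{
	uint8_t *buf;
	int i, n, nd;

	nd = 16;	/* initial guess, too small for long names */
	for(i = 0; i < 2; i++){	/* should work by the second try */
		buf = malloc(BIT16SZ + nd);
		if(buf == NULL)
			return NULL;
		n = fakestat(name, buf, BIT16SZ + nd);
		if(n < BIT16SZ){
			free(buf);
			return NULL;
		}
		nd = buf[0] | buf[1]<<8;	/* size the record needs */
		if(BIT16SZ + nd <= n){
			*np = n;
			return buf;
		}
		free(buf);	/* too small: retry with the reported size */
	}
	return NULL;
}
```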
The timing loop is here for the case where the
controller doesn't produce an interrupt when
becoming broken. In the normal case, we should
just get woken up by the interrupt.
In any case, polling 100 times a second is
not necessary here, increase the interval to 1 second.
The old strategy of wait and retry doesn't seem to
work very well as it keeps all the forking parents
stuck waiting in the kernel worsening the situation.
The idea with this change is to have rfork() return
error quickly; and without whining; as most callers
would just react with a sysfatal() which might be
better for surviving this.
The ipoput4() and ipoput6() functions can raise an error(),
which means before calling sndrst() or limbo() (from tcpiput()),
we have to get rid of our blist by calling freeblist(bp).
Make sure to set the Block pointer to nil after freeing in
ipiput() to avoid accidents.
Fix wrong panic string in sndsynack, and make any sending
functions like sndrst(), sndsynack() and tcpsendka()
return the value of ipoput*(), so we can distinguish
"no route" error.
Add a Enoroute[] string constant.
Both htontcp4() and htontcp6() can never return nil,
as they will allocate new or resize the existing block.
Remove the misleading error handling code that assumes
that it can fail.
Unlock proto on error in limborexmit() which can
be raised from sndsynack() -> ipoput*() -> error().
Make sndsynack() pass a Routehint pointer to ipoput*()
as it already did the route lookup, so we don't have to do
it twice.
i'm not confident about mutating the route tree
pointers and have concurrent readers walking the
pointer chains.
given that most route lookups are bypassed now
for non-routing case and we are not building a
high performance router here, lets play it safe.
there's no structure in the lower 32 bits of an ipv6 address.
use the top bit to distinguish special stuff like multicast
and link-local addresses, and use the 16-bit subnet-id bits
for the rest.
Instead of having to do an arp hash table lookup for each
outgoing ip packet, forward the Routehint pointer to the
medium's bwrite() function and let it cache the arp entry
pointer.
This avoids route and arp hash table lookups for tcp, il
and connection oriented udp.
It also allows us to avoid multiple route and arp table
lookups for the retransmits once an arp/neighbour solicitation
response arrives.
The Mhead structures have two sources of references to them:
- from Pgrp.mnthash hash-table
- from a channels Chan.umh pointer as returned by namec() for a union directory
Unless one holds the Mhead.lock RWLock, the Mhead.mount chain
can be mutated by either cmount(), cunmount() or closepgrp().
Readers that skipped acquiring the lock were:
mountfix(): responsible for rewriting directory entries for
union directory reads; was walking the Mhead.mount chain to
detect if the passed channel itself appears in the mount list.
cmount(): had a check and copy when "new" chan was a union itself
and if the MCREATE flag is set and would copy the mount table.
All this needs to be done with Mhead read-locked while copying
the mount entries.
devproc(): in the handler for reading /proc/n/ns file.
namec(): while checking if the Chan->umh should be initialized.
In addition to this, cmount() is changed to do the mountfree()
of the original mount chain when MREPL is done after releasing
the locks.
Also, some cosmetic changes...
The IPv4 ARP cache used to indefinitely buffer packets in the Arpent hold list.
This is bad in case of a router, because it opens a 1 second
(retransmit time) window to leak all the to be forwarded packets.
This change makes the ipv4 arp code path similar to the IPv6 neighbour
solicitation path, using the retransmit process to time out old entries
(after 3 arp retransmits => 3 seconds).
A new function arpcontinue() has been added that unifies the point when
we schedule the (ipv6 sol retransmit) / (ipv4 arp timeout) and reduce
the hold queue to the last packet and unlock the cache.
As a bonus, we also now send an icmp host unreachable notification
for the dropped packets.
tlsbwrite() would call checkstate() before calling tlsrecwrite()
to make sure the channel is open. however, because checkstate()
only raises the error, the Block* passed won't be freed and
would result in a memory leak.
move the checkstate() call inside tlsrecwrite() to reuse the
error handling that frees the block on error.
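the ownership rule behind this fix, sketched with return codes instead
of the kernel's error() mechanism. Conn, Block, recwrite() and bwrite()
are simplified, hypothetical stand-ins for the devtls types and
functions.

```c
#include <stdlib.h>

typedef struct Block Block;
struct Block {
	int	len;
};

typedef struct Conn Conn;
struct Conn {
	int	open;
	int	nwritten;
};

/*
 * Consumes b in all cases. The state check lives here, next to the
 * cleanup, so a closed channel cannot leak the block.
 */
int
recwrite(Conn *c, Block *b)
{
	if(!c->open){
		free(b);
		return -1;
	}
	c->nwritten += b->len;
	free(b);
	return 0;
}

/* the caller no longer checks the state before handing b over */
int
bwrite(Conn *c, Block *b)
{
	return recwrite(c, b);
}
```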
was testing out the git/import tweaks and accidentally
pushed this commit. No comment on whether we want it,
but it definitely wasn't ready for merge.
Oops.
http://fqa.9front.org/fqa1.html#1.2 states the supported archs.
However, clean and nuke also remove build files for 0 (spim) and q
(power). 'mk all' using those archs fails; 'mk kernels' also tries to
build all the kernels, even those which are not supported. For
example, I tried to build the power arch (qc, qa, ql) and without
surprise it failed (when building dtracy): ...
mk dtracy
qc -FTVw dtracy.c
yacc -v -d -D1 parse.y
qc -FTVw cgen.c
qc -FTVw act.c
qc -FTVw type.c
== regfree ==
REGISTER R0 <11> STRUCT DTAct cgen.c:302
== regfree ==
REGISTER R0 <11> STRUCT DTAct act.c:266
== regfree ==
qc -FTVw agg.c
cgen.c:299 unknown type in regalloc: STRUCT DTAct
cgen.c:299 bad opcode in gmove INT -> STRUCT DTAct
cgen.c:302 unknown type in regalloc: STRUCT DTAct
cgen.c:302 bad opcode in gmove INT -> STRUCT DTAct
cgen.c:302 error in regfree: 0 [0]
REGISTERmk: qc -FTVw cgen.c : exit status=rc 387386: qc 387392: error R0
<11> STRUCT DTAct act.c:269
act.c:250 unknown type in regalloc: STRUCT DTAct
act.c:250 bad opcode in gmove INT -> STRUCT DTAct
act.c:266 unknown type in regalloc: STRUCT DTAct
act.c:266 bad opcode in gmove INT -> STRUCT DTAct
act.c:266 error in regfree: 0 [0]
act.c:269 unknown type in regalloc: STRUCT DTAct
act.c:269 bad opcode in gmove INT -> STRUCT DTAct
act.c:269 error in regfree: 0 [0]
act.c:274 unknown type in regalloc: STRUCT DTAct
act.c:274 bad opcode in gmove INT -> STRUCT DTAct
act.c:274 error in regfree: 0 [0]
too many errors
mk: for(i in cc ... : exit status=rc 382748: rc 387379: mk 387381: error
mk: date for (i ... : exit status=rc 373781: rc 382226: mk 382227: error
cpu%
The patch below skips over non-supported architectures. Is that
something we want? This way, 'mk kernels' should work without a
problem (tested on amd64). Then if someone works on getting those
architectures supported again in the future, they can be added back
in.
> After some tinkering I managed to get igfx working on this device.
> hw cursor works.
> The only caveat is that I can only get video over hdmi...
> will revisit displayport later
- avoid print() format routines (saves a lot of code)
- avoid useless opens of /dev/cons (already done by initcode)
- avoid useless binds of /env and /dev (already done by initcode)
- do bind of /shr in bootrc, it is not needed by us
- we're pid 1 so kernel will print the exit message for us
We used to use the performance cycle counter for cycles(),
but it is kind of useless in userspace as each core
has its own counter and hence values are not comparable
between cores. Also, the cycle counter stops counting when
the cores are idle.
Most callers expect cycles() to return a high resolution
timestamp instead, so do the best we can do here
and enable the userspace generic timer virtual counter.
The new interface uses pci capability structures to locate the
registers in a rather fine-grained way, making it more complicated
as they can be located anywhere in any pci bar at any offset.
As far as i can see, qemu (6.0.50) never uses i/o bars in
non-legacy mode, so only mmio is implemented for now.
The previous virtio drivers implemented the legacy interface only
which uses i/o ports for all register accesses. This is still
the preferred method (and also qemu default) as it is easier to
emulate and most likely faster.
However, some vps providers like vultr force the legacy interface
to be disabled with the qemu -device option "disable-legacy=on", resulting
in a system without a disk and ethernet.
This used to be an internal function, but virtio
uses multiple structures with the same cap type
to indicate the location of various register
blocks in the pci bars so export it.
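To illustrate why a caller needs to continue a search past the first
match, here is a sketch of walking the capability list in a copy of the
configuration space. nextcap() is a hypothetical helper and not the
exported kernel function. The example operates on a plain byte array
and, for brevity, does not check the capability list bit in the status
register.

```c
#include <stdint.h>

enum {
	PciCapPtr = 0x34,	/* offset of the first capability pointer */
	PciCapVendor = 0x09,	/* vendor specific capability id, used by virtio */
};

/*
 * Find the next capability with the given id in a copy of the
 * 256 byte configuration space. Pass off = 0 to start from the
 * beginning, or a previous result to continue after it.
 * Returns the capability offset, or -1 if there is none.
 */
int
nextcap(const uint8_t cfg[256], int id, int off)
{
	int i;

	off = off == 0 ? cfg[PciCapPtr] : cfg[off+1];
	for(i = 0; i < 48 && off >= 0x40; i++){	/* bound the walk against loops */
		off &= ~3;
		if(cfg[off] == id)
			return off;
		off = cfg[off+1];
	}
	return -1;
}
```

A virtio driver would call this repeatedly with PciCapVendor and look at
the type byte of each hit to tell the register blocks apart.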
For 64-bit architectures, the a.out header has the HDR_MAGIC flag set
in the magic and is expanded by 8 bytes containing the 64-bit virtual
address of the program's entry point, while Exec.entry contains the
physical address for kernel images.
Our sysexec() would always use Exec.entry, even for 64-bit a.out binaries,
which worked because PADDR(entry) == entry for userspace pointers.
This change fixes it, having the kernel use the 64-bit entry point,
and documents the behaviour in the manpage.
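The header layout described above can be sketched as follows. This is
not the sysexec() code: aoutentry() and get32() are hypothetical
helpers, and the example assumes the usual a.out layout of eight
big-endian 32-bit fields with the entry as the sixth one.

```c
#include <stdint.h>

enum {
	HDR_MAGIC = 0x00008000,	/* header expansion flag in the magic */
	Hdrsize = 8*4,	/* eight big-endian 32-bit fields */
};

static uint32_t
get32(const uint8_t *p)
{
	return (uint32_t)p[0]<<24 | (uint32_t)p[1]<<16 | (uint32_t)p[2]<<8 | p[3];
}

/*
 * Return the entry point of an a.out image whose header starts at h.
 * With HDR_MAGIC set, the 64-bit entry following the fixed header
 * is used; otherwise the 32-bit entry field.
 */
uint64_t
aoutentry(const uint8_t *h)
{
	if(get32(h) & HDR_MAGIC)
		return (uint64_t)get32(h + Hdrsize)<<32 | get32(h + Hdrsize + 4);
	return get32(h + 5*4);
}
```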
Remove unused fields and factor common fields into a
new PMach struct in port/portdat.h.
The fields machno, splpc and proc are not moved to
PMach as they are part of the known offsets from
assembly (l.s).