Commit graph

137 commits

Author SHA1 Message Date
cinap_lenrek a8a6429204 devip: make il connect fail quickly when there's no route 2018-07-10 09:11:19 +02:00
cinap_lenrek 9898aafa0c devip: don't pad the tag for routing commands (fixes removing routes with < 4 character tags) 2018-07-09 01:32:21 +02:00
cinap_lenrek 8dd003eb04 devip: fix flush, copy tag when replacing route entry 2018-06-19 21:17:15 +02:00
cinap_lenrek 4971db9e32 udp: fix udp checksum
we did not apply the special case to store 0xFFFF (-0)
in the checksum field when the checksum calculation
returned zero. we survived this for v4 as RFC768 states:

> If the computed checksum is zero, it is transmitted as
> all ones (the equivalent in one's complement arithmetic).
>
> An all zero transmitted checksum value means that the
> transmitter generated no checksum (for debuging or for
> higher level protocols that don't care).

for ipv6 however, the checksum is not optional and receivers
would drop packets with a zero checksum.
2018-06-14 20:48:21 +02:00
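A minimal sketch of the special case described above, assuming a plain RFC 1071 style ones-complement sum over a byte buffer. The helper names onescomplsum() and udpcksumfield() are hypothetical and are not the kernel's functions; a real udp checksum also covers a pseudo header, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* ones-complement sum of a byte buffer, folded to 16 bits */
static uint16_t
onescomplsum(const uint8_t *p, size_t n)
{
	uint32_t sum = 0;

	for(; n >= 2; p += 2, n -= 2)
		sum += ((uint32_t)p[0] << 8) | p[1];
	if(n == 1)
		sum += (uint32_t)p[0] << 8;	/* pad odd trailing byte with zero */
	while(sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)sum;
}

/* value to store in the checksum field (hypothetical helper) */
static uint16_t
udpcksumfield(const uint8_t *p, size_t n)
{
	uint16_t ck = (uint16_t)~onescomplsum(p, n);

	/* a computed zero is transmitted as all ones (-0) */
	return ck == 0 ? 0xFFFF : ck;
}
```

A buffer whose sum folds to 0xFFFF complements to zero, which is the case where the field must carry 0xFFFF instead.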
cinap_lenrek de9141bc6d devip: don't send arp requests from null address
during dhcp, ipconfig assigns the null address :: which makes
ipforme() return Runi for any destination, which can trigger
arp resolution when we attempt to reply. so have v4local()
skip the null address and have sendarp() check the return
status of v4local(), avoiding the spurious arp requests.
2018-06-14 00:07:45 +02:00
cinap_lenrek 71ce6f53a4 devip: reject incompatible multicast/interface ip address pairs for ipifcaddmulti() 2018-06-13 18:58:17 +02:00
cinap_lenrek 8fdd633d57 devip: fix missing wunlock() for "ipifc not yet bound to device" case, don't create multicast entry on error 2018-06-12 20:31:39 +02:00
cinap_lenrek 71402b2ea1 devip: fix use after free in ipifcremmulti()
closeconv() calls ipifcremmulti() like:

	while((mp = cv->multi) != nil)
		ipifcremmulti(cv, mp->ma, mp->ia);

so we have to defer freeing the entry after doing:

		if((lifc = iplocalonifc(ifc, ia)) != nil)
			remselfcache(f, ifc, lifc, ma);

which accesses the otherwise free'd ia and ma arguments.
2018-06-11 03:19:42 +02:00
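The ordering problem above can be sketched on a simplified list, assuming a hypothetical Multi entry that only holds the two addresses. remmulti() and addmulti() are illustrative stand-ins, not the kernel's ipifcremmulti(); the point is only that the caller's ma and ia may point into the entry being removed, so the entry is freed last.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Multi Multi;
struct Multi {
	unsigned char ma[16];	/* multicast address */
	unsigned char ia[16];	/* interface address */
	Multi *next;
};

/* push a new entry on the list (hypothetical helper) */
static Multi*
addmulti(Multi **list, unsigned char mabyte, unsigned char iabyte)
{
	Multi *m = calloc(1, sizeof *m);

	if(m == NULL)
		return NULL;
	m->ma[0] = mabyte;
	m->ia[0] = iabyte;
	m->next = *list;
	*list = m;
	return m;
}

/* unlink the matching entry; returns -1 if there is none */
static int
remmulti(Multi **list, const unsigned char *ma, const unsigned char *ia)
{
	Multi **l, *m;
	int r;

	for(l = list; (m = *l) != NULL; l = &m->next)
		if(memcmp(ma, m->ma, 16) == 0 && memcmp(ia, m->ia, 16) == 0){
			*l = m->next;
			break;
		}
	if(m == NULL)
		return -1;
	/* stand-in for the later work that still reads ma and ia */
	r = ma[0] + ia[0];
	free(m);	/* deferred: ma and ia may point into m */
	return r;
}
```

With this order, a caller can pass mp->ma and mp->ia of the head entry in a loop, as closeconv() does, without touching freed memory.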
cinap_lenrek 94f6f89ac1 devip: do not icmp reply on multicast destination 2018-06-11 03:14:28 +02:00
cinap_lenrek c9bb6f68eb devip: don't set mtu of interface to zero when not specified (thanks joe9)
change 9f74a951ae6a introduced a bug that set the mtu of a new
interface to 0 when not specified in the add ctl.
2018-05-14 19:18:13 +02:00
cinap_lenrek 298f239695 ip: add some primitive rate limiting knobs to counteract bufferbloat 2018-05-10 19:31:58 +02:00
cinap_lenrek 5aae3d344b devip: improve arp and ndp code
there appears to be confusion about the refresh flag of arpenter().
when we get an arp reply, it makes more sense to just refresh
waiting/existing entries instead of creating a new one as we do not
know if we are going to communicate with the remote host in the future.

when we see an arp request for ourselves however, we want to always
enter the sender's address into the arp cache as it is likely the sender
attempts to communicate with us and with the arp entry, we can reply
immediately.

reject senders from multicast/broadcast mac addresses. that's just silly.

we can get rid of the multicast/broadcast ip checks in ethermedium and
do it in arpenter() instead, checking the route type for the target to
see if it's a non unicast target.

enforce strict separation of interface's arp entries by passing a
rlock'd ifc explicitly to arpenter, which we compare against the route
target interface. this makes sure arp/ndp replies only affect entries for
the receiving interface.

handle neighbor solicitation retransmission in nbsendsol() only. that is,
both ethermedium and the rxmitproc just call nbsendsol() which maintains
the timers and counters and handles the rotation on the re-transmission
chain.
2018-04-24 20:21:09 +02:00
cinap_lenrek 20b9326dad devip: fix ipv6 icmp unreachable handling, fix retransmit, fix ifc locking, remove tentative check 2018-04-22 18:54:13 +02:00
cinap_lenrek c80d94304d devip: cleanup ipmux.c 2018-04-22 18:50:45 +02:00
cinap_lenrek 8962551055 devip: increment in counter *AFTER* acquiring the ifc lock or loopbackmedium 2018-04-22 18:50:11 +02:00
cinap_lenrek dbf13129a7 devip: cleanup rudp.c 2018-04-22 18:49:01 +02:00
cinap_lenrek 9860172fce devip: cleanup tcp.c 2018-04-22 18:48:32 +02:00
cinap_lenrek c5c613357e devip: cleanup udp.c 2018-04-22 18:48:08 +02:00
cinap_lenrek 26aca332bb devip: various icmp stuff
no need to rlock ifc in targettype() as we are called from icmpiput6(),
which has the ifc rlocked.

for icmpadvise, the lport, destination *AND* source have to match.

a connection gets a packet when the packet's destination matches the source
*OR* the packet's source matches the destination.
2018-04-22 18:47:19 +02:00
cinap_lenrek 575398eb9b devip: verify ifcid on routehint check, check Route.ref for free'd routes
v4lookup() and v6lookup() do not acquire the routelock, so it is
possible to hit routes that are on the freelist. to detect these,
we set ref to 0 and check for this case, avoiding overriding the ifc.

re-evaluate routes when the ifcid on the route hint doesn't match.
2018-04-22 18:42:22 +02:00
cinap_lenrek 638b4a1ec1 devip: add "reflect" ctl message, fix memory leaks in icmpv6, fix source address for icmpttlexceeded, cleanup 2018-04-19 01:08:51 +02:00
cinap_lenrek 874701d193 devip: make v4 ifc broadcast and multicast routes specific to address
this allows one to access the same network via multiple interfaces,
the local address then determines which interface is used.
2018-04-11 22:56:25 +02:00
cinap_lenrek 829a451c2b devip: properly initialize the connection ignoreadvice and tos flags 2018-04-10 20:02:03 +02:00
cinap_lenrek c2dd9b1da7 devip: implement source specific routing 2018-04-08 21:15:00 +02:00
cinap_lenrek 547f60b4c5 devip: pick source address for neighbor solicitations as of rfc4861 7.2.2, cleanup
rfc4861 7.2.2:

If the source address of the packet prompting the solicitation is the
same as one of the addresses assigned to the outgoing interface, that
address SHOULD be placed in the IP Source Address of the outgoing
solicitation.

this change adds ndbsendsol() which handles the source address selection
and also handles the arp table locking; avoiding access to the arp entry
after the arp table is unlocked.

cleanups:

- use ipmove() instead of memmove().
- useless extern qualifiers
2018-03-19 01:11:08 +01:00
cinap_lenrek 71f807873b devip: more v6 improvements
ipv4local() and ipv6local() now take remote address argument,
returning the closest local address to the source. this
implements the standardized source address selection rules
instead of just returning the first local v4 or v6 address.

the source address selection was broken for esp, rudp and udp,
blindly assuming ifc->lifc->local being a valid v4 address.
use ipv6local() instead.

the v6 routing code used to lookup source address route to
decide to drop the packet instead of checking the interface
on the destination route.

factor out the route hint from Conv and put it in Routehint
structure. avoiding stack bloat in v4 routing. implement the
same trick for v6 avoiding second route lookup in ipoput6.

fix memory leak in icmpv6 router solicitation handling.

remove old unfinished handling of multiple v6 routers. should
implement source specific routes instead.

avoid duplication, use common convipvers() function.

use isv4() instead of memcmp v4prefix.
2018-03-18 07:50:48 +01:00
cinap_lenrek b2d7992025 kernel: properly handle bad attach specifiers
- only accept decimal for numeric device id's
- exclude negative device id's
- device id's out of range yield Enodev
2018-02-25 17:11:18 +01:00
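The three rules above can be sketched as a small parser. This is an illustration only: parsedevid(), the Badspec and Nodev return codes, and the treatment of an empty specifier as device 0 are assumptions, not the kernel's code.

```c
/* hypothetical result codes; the kernel raises errors instead */
enum {
	Badspec = -1,	/* not a plain decimal number */
	Nodev = -2,	/* number is out of range (Enodev case) */
};

/* parse an attach specifier into a device id in [0, ndev) */
static int
parsedevid(const char *spec, int ndev)
{
	long long id = 0;
	const char *p;

	if(*spec == '\0')
		return 0;	/* assumption: empty spec selects device 0 */
	for(p = spec; *p != '\0'; p++){
		if(*p < '0' || *p > '9')
			return Badspec;	/* rejects "-1", "0x1", "1a" */
		id = id*10 + (*p - '0');
		if(id >= ndev)
			return Nodev;	/* also stops before id can overflow */
	}
	return (int)id;
}
```

Because a leading '-' is not a digit, negative ids are rejected by the same check that rejects hex and other non-decimal input.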
cinap_lenrek 5560efb3db devip: fix crash on negative dev id on attach 2018-02-25 03:32:29 +01:00
cinap_lenrek 950e22be67 ip: make pkt interfaces unbind on close (from inferno) 2018-01-22 21:33:22 +01:00
cinap_lenrek 9840c50a3e gre: don't drop pptp packets when smaller than v4 header 2018-01-20 15:13:11 +01:00
cinap_lenrek ccf72da47d set router R-flag when sendra is active for neighbor advertisement
windows 7 just drops the default router when it tries to
probe for router reachability but gets a neighbor advertisement
from the router with the router bit clear.

so set the R-flag when sendra is active, which implies that
we are a router.
2018-01-16 20:42:01 +01:00
cinap_lenrek 02b6831fa5 kernel: remove Ipifc.mbps, unused. 2017-12-23 02:58:47 +01:00
cinap_lenrek 35bc3ac573 devether: remove duplicated parseether() implementation (pull from libip) 2017-12-09 22:07:32 +01:00
cinap_lenrek 4aeefba681 kernel: add "close" ctl message for tcp connection to gracefully hang up a connection without a tcp reset (used by go) 2017-01-12 20:04:41 +01:00
cinap_lenrek 78d2a52577 ip/tcp: never raise the mss over the link mtu < 1280 for v6
v6 mandates minimum mtu of 1280, though someone *could* set up
an interface with a lower mtu or set it lower for testing.
2016-11-16 00:54:04 +01:00
cinap_lenrek 323d625864 ip: get rid of update_mtucache() and restrict_mtu() prototypes 2016-11-15 22:13:08 +01:00
cinap_lenrek 30c5c3404b ip/pktmedium: no mintu, no maclen... this is ip packets 2016-11-15 22:11:47 +01:00
cinap_lenrek 3579757291 ip/pktmedium: fix wrong hsize, there's no ethernet header on packet media
packet media is just raw ip packets, so there's no link-level
header there. was probably copy-pasted from ethermedium...
2016-11-15 21:54:03 +01:00
cinap_lenrek 1f628ef132 ip/tcp: only calculate mss from interface mtu when directly reachable for v6
we currently do not implement path mtu discovery so for
destinations that are not directly reachable assume the
minimum mtu of 1280 bytes.
2016-11-15 20:28:45 +01:00
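A rough sketch of the v6 mss rule from this change, combined with the later "never raise the mss over the link mtu < 1280" fix. tcpmss6() and its onlink flag are hypothetical; the header sizes assume a 40 byte ipv6 header and a 20 byte tcp header without options.

```c
enum {
	IP6HDR = 40,		/* fixed ipv6 header */
	TCPHDR = 20,		/* tcp header without options */
	IP6MINMTU = 1280,	/* minimum mtu mandated by v6 */
};

/*
 * mss for a v6 connection (hypothetical helper).
 * onlink is nonzero when the destination is directly reachable.
 */
static int
tcpmss6(int linkmtu, int onlink)
{
	int mtu = linkmtu;

	/* no path mtu discovery: assume the minimum for remote hosts,
	 * but never go above a link mtu that is already lower */
	if(!onlink && mtu > IP6MINMTU)
		mtu = IP6MINMTU;
	return mtu - (IP6HDR + TCPHDR);
}
```

For a 1500 byte link this gives 1440 for a directly reachable host and 1220 otherwise; a test interface with mtu 1000 yields 940 either way.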
cinap_lenrek d97eb114d5 kernel/ip: fix typo (rfc -> ifc) 2016-11-08 22:33:48 +01:00
cinap_lenrek ba38aa8b9d gre: check nil for pullupblock() 2016-11-08 22:33:19 +01:00
cinap_lenrek 99cc56f2e9 kernel/ip: remove nil checks for allocb() and padblock() 2016-11-08 21:05:01 +01:00
cinap_lenrek 857f2528e0 ip: always pass a single block to Medium.bwrite(), avoid concatblock() calls in Dev.bwrite()
the convention for Dev.bwrite() is that it accepts a *single* block,
and not a block chain. so we never have concatblock here.

to keep stuff consistent, we also guarantee that Medium.bwrite()
will get a *single* block passed as well, as the callers are
few in number.
2016-11-07 22:05:29 +01:00
cinap_lenrek ea993877a9 ip/nullmedium: free passed block in nullbwrite() 2016-11-07 21:40:12 +01:00
cinap_lenrek 59dd0af53a ip/tcp: remove useless nil checks for padblock() and allocb() return value 2016-11-07 21:39:28 +01:00
cinap_lenrek 055f837043 ip: simplify code as packblock() and concatblock() will never error 2016-10-23 00:31:42 +02:00
cinap_lenrek 75c6ab45e0 devip: simplify ipbwrite() by using return value of qbwrite() 2016-10-23 00:29:41 +02:00
cinap_lenrek ef5c862ce9 ip/icmp: only reply to echo request when directed to us and source is unicast 2016-10-23 00:25:17 +02:00
cinap_lenrek a121806126 kernel: replace various custom random iv buffer filling functions with calls to prng() 2016-09-11 01:54:06 +02:00
cinap_lenrek 7f16c92762 ip/esp: allocate cipher states in secret memory 2016-08-27 20:38:33 +02:00
cinap_lenrek 58a0db935c ip/il: don't attempt to connect over IPv6, IL only supports IPv4 packets 2016-08-14 23:07:10 +02:00
cinap_lenrek 66719fb3ea kernel: fix cb->f[0] nil dereferences due to short control request 2016-05-05 18:54:58 +02:00
cinap_lenrek 38a8af2d72 devip: applying changes for bug: multicasts_and_udp_buffers
/n/bugs/open/multicasts_and_udp_buffers
http://bugs.9front.org/open/multicasts_and_udp_buffers/readme

michal@Lnet.pl

I have ported my small MPEG-TS analysis tool to Plan9.

To make this application work I had to fix a bug in the kernel IPv4 code and increase the UDP input buffer.

The bug is related to listening for IPv4 multicast traffic. There is no problem if you listen for only one group or multiple groups with different UDP ports. This works:

Write to UDP ctl:

announce PORT
addmulti INTERFACE_ADDR MULTICAST_ADDR
headers

and you can read packets from data file.

You need to set the headers option because otherwise every UDP packet for MULTICAST_ADDR!PORT is treated as a separate connection. This is a bug and should be fixed too, but I haven't tried to.

There is a problem when you need to receive packets for multiple multicast groups. Usually the same destination port is used by multiple streams and the above sequence of commands fails for the second group because the port is the same.

A simple and probably non-intrusive fix is adding "|| ipismulticast(addr)" to the if statement at /sys/src/9/ip/devip.c:861:

if(ipforme(c->p->f, addr) || ipismulticast(addr))

This fixes the problem and now you can use the following sequence to listen for multiple multicast groups even if they all have the same destination port:

announce MULTICAST_ADDR!PORT
addmulti INTERFACE_ADDR MULTICAST_ADDR
headers

After that my application started working, but it reports packet drops at >2 Mb/s input rate. The same is reported by the kernel netlog. Increasing the capacity of the UDP connection input queue fixes this problem (/sys/src/9/ip/udp.c:153):

c->rq = qopen(512*1024, Qmsg, 0, 0);

--
Michał Derkacz
2016-03-28 16:58:09 +02:00
cinap_lenrek 8f2d9a139f devip: handle ignoreadvice flag for all protocols 2016-03-12 23:07:58 +01:00
cinap_lenrek 688c1f15cd fix ipv6 icmphostunr() locking and memory free bugs (from sources) 2016-02-21 16:36:41 +01:00
ftrvxmtrx 668318b2e6 ip/chandial: fail with Ebadarg instead of printing memory contents 2016-02-12 23:52:50 +02:00
cinap_lenrek 772afbe98c format pointer subtraction results with %zd instead of %ld (for long -> intptr on amd64) 2016-01-07 04:44:13 +01:00
cinap_lenrek 8a784a3b9b devip: declare cleanarpent() static 2015-09-27 22:41:38 +02:00
cinap_lenrek 4449a34756 devip: various bugfixes and cleanups for arp code
- fix missing runlock(ifc) when ifcid != a->ifcid in rxmitsols() (thanks erik quanstro)
- don't leak packets when transferring blocks from arp entry hold list to droplist
- free rest of droplist when bwrite() errors in arpenter(), remove useless checks (ifc != nil)
- free arp entry hold list from cleanarpent()
- consistent use of nil for pointers
2015-09-27 22:17:02 +02:00
cinap_lenrek 46926aa502 tcp: fix mtu on server sockets again (thanks mycroftix)
for incoming connection, we used s->laddr to lookup the interface
for the incoming call, but this does not work when the announce
address is tcp!*!123, then s->laddr is all zeros "::". instead,
use the incoming destination address for interface mtu lookup.

thanks mycroftix for troubleshooting!
2015-09-02 01:50:55 +02:00
cinap_lenrek e3a64494e7 libsec: remove flawed aes() digest and hmac_aes() implementations (thanks aiju) 2015-09-01 21:35:43 +02:00
cinap_lenrek 94333d83ab ip: fix wrong radix for iphash() (thanks yoann padioleau)
yoann padioleau's report on 9fans:

> I think I’ve found a bug in the network stack.
> in 9/ip/ip.h there is
> struct Ipht
> {
> 	Lock;
> 	Iphash	*tab[Nipht];
> };
>
> where Nipht is 521,
>
> but then in 9/ip/ipaux.c there is
>
> ulong
> iphash(uchar *sa, ushort sp, uchar *da, ushort dp)
> {
> 	return ((sa[IPaddrlen-1]<<24) ^ (sp << 16) ^ (da[IPaddrlen-1]<<8) ^ dp ) % Nhash;
> }
>
> where Nhash is just 64,
2015-06-09 10:04:04 +02:00
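The effect of the wrong radix can be sketched as follows. The mixing expression and the constants 521 and 64 come from the report above; the mix(), iphash_old() and iphash_new() split is only for illustration, and reducing modulo Nipht is an assumption about the fix, which is not shown here.

```c
#include <stdint.h>

enum {
	IPaddrlen = 16,
	Nipht = 521,	/* number of buckets in Ipht.tab */
	Nhash = 64,	/* radix the old code reduced by */
};

/* the mixing step quoted in the report */
static uint32_t
mix(const uint8_t *sa, uint16_t sp, const uint8_t *da, uint16_t dp)
{
	return ((uint32_t)sa[IPaddrlen-1] << 24) ^ ((uint32_t)sp << 16)
		^ ((uint32_t)da[IPaddrlen-1] << 8) ^ dp;
}

/* old behaviour: only buckets 0..63 of the 521 are ever used */
static uint32_t
iphash_old(const uint8_t *sa, uint16_t sp, const uint8_t *da, uint16_t dp)
{
	return mix(sa, sp, da, dp) % Nhash;
}

/* assumed fix: reduce by the real table size */
static uint32_t
iphash_new(const uint8_t *sa, uint16_t sp, const uint8_t *da, uint16_t dp)
{
	return mix(sa, sp, da, dp) % Nipht;
}
```

With the old radix the table still works, but most of its buckets stay empty and the hash chains grow longer than necessary.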
cinap_lenrek 21f97338f8 tcp: fix loopback slowness issue / set tcb->mss for incoming connections (thanks David du Colombier)
David du Colombier wrote:
> The slowness issue only appears on the loopback, because
> it provides a 16384 MTU.
>
> There is an old bug in the Plan 9 TCP stack, where the TCP
> MSS doesn't take into account the MTU for incoming connections.
>
> I originally fixed this issue in January 2015 for the Plan 9
> port on Google Compute Engine. On GCE, there is an unusual
> 1460 MTU.
>
> The Plan 9 TCP stack defines a default 1460 MSS corresponding
> to a 1500 MTU. Then, the MSS is fixed according to the MTU
> for outgoing connections, but not incoming connections.
>
> On GCE, this issue leads to IP fragmentation, but GCE didn't
> handle IP fragmentation properly, so the connections
> were dropped.
>
> On the loopback medium, I suppose this is the opposite issue.
> Since the TCP stack didn't fix the MSS in the incoming
> connection, the programs sent multiple small 1500 bytes
> IP packets instead of large 16384 IP packets, but I don't
> know why it leads to such a slowdown.
2015-05-14 21:09:12 +02:00
cinap_lenrek 1db9f19b62 ip: exclude "don't fragment" bit from ipv4 reassembly test
other operating systems always set the "don't fragment" bit
in their outgoing ipv4 packets causing us to unnecessarily
call ip4reassemble() looking for a fragment reassembly queue.

the change excludes the "don't fragment" bit from the test
so we now call ip4reassemble() only when the "more fragments"
bit is set or a fragment offset other than zero is given.

this optimization was discovered from akaros.
2014-12-21 17:25:55 +01:00
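A small sketch of the two tests on the 16 bit flags and fragment offset field of the ipv4 header. The bit values follow the ipv4 header layout; IP_FO and the two function names are hypothetical, and the "old" test is only my reading of the description above, not the original code.

```c
#include <stdint.h>

enum {
	IP_DF = 0x4000,	/* don't fragment */
	IP_MF = 0x2000,	/* more fragments */
	IP_FO = 0x1FFF,	/* fragment offset mask (hypothetical name) */
};

/* assumed old test: any bit set, including DF, triggers reassembly */
static int
needreasm_old(uint16_t frag)
{
	return frag != 0;
}

/* new test: only MF or a non-zero fragment offset count */
static int
needreasm_new(uint16_t frag)
{
	return (frag & (IP_MF | IP_FO)) != 0;
}
```

An unfragmented packet with only DF set (0x4000) passes the old test but not the new one, so it skips the reassembly queue lookup.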
cinap_lenrek f51f73bdca ip: implement "hangup" ctl for udp protocol 2014-11-13 16:47:19 +01:00
cinap_lenrek 84c40fb226 devip: sanity check Nchan in Fsproto()
devip can only handle Maskconv+1 conversations per
protocol depending on how many bits it uses in the
qid to encode the conversation number.

we check this when the protocol gets registered.

if we do not do this, the kernel will mysteriously
panic when the conversation numbers collide which
took some time to debug.
2014-09-21 19:24:38 +02:00
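Why the numbers collide can be sketched with a packed qid path. The bit widths below (5 type bits, 12 conversation bits) and the helper names are assumptions chosen for illustration; only the name Maskconv and the Maskconv+1 limit come from the message above.

```c
enum {
	Logtype = 5,	/* assumed width of the file type field */
	Logconv = 12,	/* assumed width of the conversation field */
	Maskconv = (1 << Logconv) - 1,
	Shiftconv = Logtype,
	Shiftproto = Logtype + Logconv,
};

/* pack protocol, conversation and file type into one path */
static unsigned long
qidpath(unsigned proto, unsigned conv, unsigned type)
{
	return ((unsigned long)proto << Shiftproto)
		| ((unsigned long)conv << Shiftconv) | type;
}

/* recover the conversation number from a path */
static unsigned
qidconv(unsigned long path)
{
	return (path >> Shiftconv) & Maskconv;
}

/* the registration time sanity check (hypothetical helper) */
static int
nchanok(int nchan)
{
	return nchan > 0 && nchan <= Maskconv + 1;
}
```

Under these assumed widths, conversation Maskconv+1 of protocol 0 packs to the same path as conversation 0 of protocol 1, which is the kind of collision the check rules out at registration.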
cinap_lenrek c145a2c0aa devip: print protocol name in garbage collection notification 2014-09-21 18:02:53 +02:00
cinap_lenrek acb49987e6 ip: set arp entry for own v6 address when not tentative
after running ip/ipconfig -6, we are unable to ping our
own link-local address and the arp daemon sends out useless
neighbor solicitation requests to itself. this change
adds an arp entry for our ipv6 address. however, this
must not be done for tentative interface configuration.
2014-08-26 21:29:37 +02:00
cinap_lenrek 2ec9006e9e ip: fix memory leak in ipifcadd6()
allocate the Iplifc structure on the stack instead.
i assume that it was allocated on heap in fear of
causing stack overflow. on 386, this adds around
88 bytes on the stack but it doesn't seem to cause
any trouble. (checked with poolcheck after ctl write)
2014-08-21 00:30:13 +02:00
cinap_lenrek 55bf3d6399 ip: fix missed unlocks and waserror handlers
ipifcunbind() could error out from ipifcremlifc() and Medium.unbind()
*after* decrementing ifc->conv->inuse! move the decrement after
calling these functions.

make ipifcremlifc() never raise error but return error string.
the only places where it could error is when it calls into
medium functions like Medium.remroute() and Medium.remmulti().
Ignore these errors as they could happen when the ethernet driver
crashed (think imported ethernet device or usb ethernet
in userspace), so we will be able to unbind.

add waserror() handlers as necessary to deal with errors from
Medium.addmulti(), Medium.areg() and arpenter() to properly
unlock the data structures.
2014-08-12 21:35:31 +02:00
cinap_lenrek be3a5a6dc3 kernel: remove Block refcounting (thanks erik) 2014-06-08 00:19:33 +02:00
cinap_lenrek a321204a20 icmp: use snprint, add more unreachable error messages (from erik quanstro) 2014-04-12 18:59:16 +02:00
cinap_lenrek bfbb68a712 ipmux: fix 6c complaints 2014-02-03 20:14:19 +01:00
cinap_lenrek 98f47d5867 kernel: more kproc pexit() and sleep error handling 2013-11-22 22:56:34 +01:00
cinap_lenrek 176569ca4d apply erik quanstros tcp-bdp patch (from sources)
this patch consists of two bits of work submitted as one
patch.

the first bit fixed a "pacing" problem, where a tcp connection
rate-limited by the reading process would experience 10%
of the expected throughput, and could even get into live
lock.  it was noticed at the time of this initial work that
the stack often sent tiny grams.  some good bits from nix'
original tcp were merged in.  the test program
	/n/sources/contrib/quanstro/tcptest.c
will verify that under most conditions, a reader-paced connection
now gets the expected throughput.  expected arguments
would be
	tcptest -s1 -n 5000 -l

the second bit is a first step in preparing tcp to handle
modest (1-2MB) bandwidth-delay products.  the strategy
was to completely implement NewReno.  the testing network
was a 7/35/70ms by 100Mbit wan emulator with 0/.05/.1% loss.
here are the performance comparisons from the changes after
the first round "old" to the submitted patch "new".  the
smallest improvement was 80%, the largest was 11x.

loss%	rtt	old	new
0.10	7	4.40	7.85
0.10	35	0.88	1.79
0.10	70	0.47	0.84
0.05	7	4.80	9.38
0.05	35	1.00	2.02
0.05	70	0.52	1.77
0.01	7	5.33	11.87
0.01	35	1.14	10.97
0.01	70	0.54	4.75
0.00	7	4.49	11.92
0.00	35	1.04	11.35
0.00	70	0.58	10.56

since the diff is not very easy to read, i wrote a small
paper detailing the changes

	http://www.quanstro.net/plan9/tcp/tcp.pdf

- erik
2013-07-21 14:41:51 +02:00
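For orientation, the bandwidth-delay product of the test network mentioned above can be computed directly. bdpbytes() is a hypothetical helper, and the arithmetic ignores protocol overhead.

```c
#include <stdint.h>

/* bandwidth-delay product in bytes for a link rate and round trip time */
static uint64_t
bdpbytes(uint64_t bitspersec, uint32_t rttms)
{
	/* bits/s * ms / 1000 gives bits in flight; / 8 gives bytes */
	return bitspersec * rttms / 8000;
}
```

At 100Mbit this gives 87500 bytes for 7ms, 437500 bytes for 35ms and 875000 bytes for 70ms, which is roughly the amount of unacknowledged data the sender has to keep in flight to fill the link.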
cinap_lenrek bf048d94c3 ip/ethermedium: drop short packets instead of producing negative size blocks
on usb ethernet, it can happen that we read truncated packets smaller
than the ethernet header size. this produces a warning in pullupblock()
later like: "pullup negative length packet, called from 0xf0199e46"
2013-06-17 02:28:10 +02:00
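The check can be sketched as a length test done before the header size is subtracted. etherpayloadlen() is a hypothetical helper and 14 bytes is the usual ethernet header size; the kernel code works on Block structures instead.

```c
enum {
	ETHERHDRSIZE = 14,	/* dst(6) + src(6) + type(2) */
};

/*
 * payload length of a received frame, or -1 when the frame is
 * shorter than the header and should be dropped (hypothetical helper)
 */
static long
etherpayloadlen(long framelen)
{
	if(framelen < ETHERHDRSIZE)
		return -1;	/* truncated read, never produce a negative size */
	return framelen - ETHERHDRSIZE;
}
```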
cinap_lenrek d3b727db18 devip: don't raise error() out of Fsprotoclone()
Fsprotoclone() is not supposed to raise error, but return nil.
ipopen() seemed to assume otherwise as it set up an error label
before calling Fsprotoclone(). fix ipopen(), make Fsprotoclone()
return nil instead of raising error.
2013-05-05 04:28:50 +02:00
cinap_lenrek 9500191af6 devip: handle malloc errors, fix queue leaks
Fsprotoclone():

qopen() and qbypass() can fail and return nil, so make sure
the connection was not partially created by checking if read
and write queues have been set up by the protocol create handler.
on error, free any resources of the partial connection and
error out.

netlogopen(): check malloc() error.
2013-05-05 03:56:11 +02:00
cinap_lenrek 54b62fe493 arp: fix memory leaks for "flush" and "del" arp ctl messages 2013-01-22 15:26:34 +01:00
cinap_lenrek 1159f1e54f ip: fix assert panic on fragmented icmp echo request (see eriks icmp-frag patch) 2012-08-02 02:02:10 +02:00
cinap_lenrek 546ee86c52 tcp: memset paranoia, synced from sources 2012-07-09 21:01:42 +02:00
cinap_lenrek 1de9ca2de5 bring back il protocol support 2012-05-03 10:47:40 +02:00
cinap_lenrek c9f5d14ea6 ip: fix missing poperror (from applied/netlogpoperror) 2012-02-13 05:56:47 +01:00
cinap_lenrek c44b78f739 change definition of Chan.create to return a chan like open 2011-08-17 23:27:31 +02:00
cinap_lenrek bf8fcab00d devip: don't panic when ports get exhausted 2011-07-08 14:39:13 +00:00
Taru Karttunen a9060cc06b Import sources from 2011-03-30 iso image - lib 2011-03-30 19:35:09 +03:00
Taru Karttunen e5888a1ffd Import sources from 2011-03-30 iso image 2011-03-30 15:46:40 +03:00