When calling ioctl on a socket with SIOCGIFHWADDR, return the correct
physical interface type. This value was previously hardcoded to
ARPHRD_ETHER (Ethernet); it can now also be ARPHRD_LOOPBACK for the
loopback adapter.
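A minimal sketch of the idea, assuming each adapter class reports its own
hardware type. The class and function names are hypothetical; the two
numeric values are the ones Linux's <net/if_arp.h> uses for ARPHRD_ETHER
and ARPHRD_LOOPBACK, redefined here so the sketch compiles on its own.

```cpp
#include <cstdint>

// Values as found in Linux's <net/if_arp.h>; local names are illustrative.
constexpr uint16_t kArphrdEther = 1;
constexpr uint16_t kArphrdLoopback = 772;

// Hypothetical adapter hierarchy, not the kernel's real classes.
class SketchAdapter {
public:
    virtual ~SketchAdapter() = default;
    // Ethernet is the default, matching the previously hardcoded value.
    virtual uint16_t hardware_type() const { return kArphrdEther; }
};

class SketchLoopback final : public SketchAdapter {
public:
    uint16_t hardware_type() const override { return kArphrdLoopback; }
};

// What a SIOCGIFHWADDR handler could store in ifr_hwaddr.sa_family.
inline uint16_t hwaddr_family_for(SketchAdapter const& adapter)
{
    return adapter.hardware_type();
}
```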
Instead of using a clunky chain of if-statements, every driver now
declares two methods on its adapter class: create and probe.
These methods are referenced by a PCINetworkDriverInitializer structure,
and all such structures live in a new static list, s_initializers.
Then, when we probe for a PCI device, we try each probe method in turn,
and on a match the corresponding create method is called. After
the adapter instance is created, we call the virtual initialize method
on it, because many drivers actually require a sort of post-construction
initialization sequence to ensure the network adapter can properly
function.
As a result of this change, it's much easier to add more drivers, the
initialization code is more readable, and it's easier to understand
when and where things could fail in the whole initialization sequence.
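The probe/create flow described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the kernel's code:
every type and function name here is a stand-in, the drivers do no real
hardware work, and only the overall shape (a static table of probe/create
pairs, followed by a virtual initialize call) comes from the description.

```cpp
#include <cstdint>
#include <memory>
#include <string_view>

// All names below are illustrative stand-ins, not the kernel's real types.
struct PciId {
    uint16_t vendor;
    uint16_t device;
};

class Adapter {
public:
    virtual ~Adapter() = default;
    // Post-construction setup; returns false if bring-up fails.
    virtual bool initialize() = 0;
    virtual std::string_view name() const = 0;
};

class SketchE1000 final : public Adapter {
public:
    static bool probe(PciId const& id) { return id.vendor == 0x8086 && id.device == 0x100e; }
    static std::unique_ptr<Adapter> create(PciId const&) { return std::make_unique<SketchE1000>(); }
    bool initialize() override { return true; }
    std::string_view name() const override { return "e1000"; }
};

class SketchRTL8139 final : public Adapter {
public:
    static bool probe(PciId const& id) { return id.vendor == 0x10ec && id.device == 0x8139; }
    static std::unique_ptr<Adapter> create(PciId const&) { return std::make_unique<SketchRTL8139>(); }
    bool initialize() override { return true; }
    std::string_view name() const override { return "rtl8139"; }
};

// One entry per driver: how to recognize a device and how to build an adapter.
struct DriverInitializer {
    bool (*probe)(PciId const&);
    std::unique_ptr<Adapter> (*create)(PciId const&);
};

static DriverInitializer const s_initializers[] = {
    { SketchE1000::probe, SketchE1000::create },
    { SketchRTL8139::probe, SketchRTL8139::create },
};

std::unique_ptr<Adapter> adapter_for(PciId const& id)
{
    for (auto const& initializer : s_initializers) {
        if (!initializer.probe(id))
            continue;
        auto adapter = initializer.create(id);
        if (!adapter || !adapter->initialize())
            return nullptr; // creation or post-construction init failed
        return adapter;
    }
    return nullptr; // no driver claimed the device
}
```

Adding a driver in this shape means writing the two static methods and
appending one line to the table; each failure point has its own return.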
This step would ideally not have been necessary (it increases the
amount of refactoring and templates needed, which in turn increases
build times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
obtain any lock rank just via the template parameter. It was not
previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
wanted in the first place (or made use of). Locks randomly changing
their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
taken in the right order (with the right lock rank checking
implementation) as rank information is fully statically known.
This refactoring further exposes the fact that Mutex has no lock rank
capabilities, which is not fixed here.
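To make the properties listed above concrete, here is a small sketch of a
lock-protected value whose rank is a mandatory template parameter. The
names (SketchLockRank, SketchSpinlock, RankedProtected, may_nest) and the
rank values are hypothetical, and the rule "acquire in increasing rank
order" is an assumption of this sketch, not a statement about the kernel's
actual ordering policy.

```cpp
#include <atomic>
#include <mutex>
#include <utility>

// Hypothetical ranks; the real enum and its values may differ.
enum class SketchLockRank : int { None = 0, Thread = 1, Process = 2 };

// Tiny stand-in spinlock usable with std::lock_guard.
class SketchSpinlock {
public:
    void lock()
    {
        while (m_flag.test_and_set(std::memory_order_acquire)) { }
    }
    void unlock() { m_flag.clear(std::memory_order_release); }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};

// The rank is part of the type: it cannot be omitted or changed at runtime.
template<typename T, SketchLockRank Rank>
class RankedProtected {
public:
    static constexpr SketchLockRank rank = Rank;

    template<typename... Args>
    explicit RankedProtected(Args&&... args)
        : m_value(std::forward<Args>(args)...)
    {
    }

    // Runs the callback with the lock held and returns its result.
    template<typename Callback>
    decltype(auto) with(Callback&& callback)
    {
        std::lock_guard<SketchSpinlock> guard(m_lock);
        return callback(m_value);
    }

private:
    T m_value;
    SketchSpinlock m_lock;
};

// Compile-time ordering check (assumed rule: outer rank < inner rank).
template<typename Outer, typename Inner>
constexpr bool may_nest = static_cast<int>(Outer::rank) < static_cast<int>(Inner::rank);
```

Because the rank lives in the type, a wrong nesting order can be rejected
with a static_assert on may_nest instead of being found at runtime.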
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
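The difference between the two flavors comes down to how the counter is
updated. The sketch below uses hypothetical class names and is not AK's
implementation; it only shows a plain counter next to an atomic one with
the same interface.

```cpp
#include <atomic>
#include <cstdint>

// Plain counter: cheap, but only safe when a single thread touches it.
class PlainRefCount {
public:
    void ref() { ++m_count; }
    // Returns true when the last reference was just dropped.
    bool deref() { return --m_count == 0; }
    uint32_t count() const { return m_count; }

private:
    uint32_t m_count { 1 };
};

// Atomic counter: safe to ref/deref from several threads or CPUs at once.
class AtomicRefCount {
public:
    void ref() { m_count.fetch_add(1, std::memory_order_relaxed); }
    // fetch_sub returns the previous value, so 1 means we hit zero.
    bool deref() { return m_count.fetch_sub(1, std::memory_order_acq_rel) == 1; }
    uint32_t count() const { return m_count.load(std::memory_order_relaxed); }

private:
    std::atomic<uint32_t> m_count { 1 };
};
```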
It doesn't make sense after the introduction of the routing table, which
allows multiple gateways for every interface, and it is no longer used
by any of the userspace programs.
Read the appropriate registers for RTL8139, RTL8168 and E1000.
For NE2000, just assume 10 Mbit/s full duplex, as there is no indicator
for it in the pure NE2000 spec. Use mock values for loopback.
Instead of initializing network adapters in init.cpp, let's move that
logic into a separate class.
Also, it seems like a good idea to make this class responsible for
enumerating network adapters after the boot process, so this singleton
will take care of finding the appropriate network adapter when asked
for one by IPv4 address or interface name.
With this change merged, we simplify the creation logic of
NetworkAdapter-derived classes: we enumerate the PCI bus only once,
searching for driver candidates as we go, and we let each driver
test whether it is responsible for the specified PCI device.
Previously we'd allocate buffers when sending packets. This patch
avoids these allocations by using the NetworkAdapter's packet queue.
At the same time this also avoids copying partially constructed
packets in order to prepend Ethernet and/or IPv4 headers. It also
properly truncates UDP and raw IP packets.
There's no good reason to distinguish between network interfaces based
on their model. It's probably a good idea to try to keep the names more
persistent, so scripts written for a specific network interface will
still be usable after a hotplug event (or after rebooting with a new
hardware setup).
This avoids two allocations when receiving network packets: one for
inserting a PacketWithTimestamp into m_packet_queue and another one
when inserting buffers into the list of unused packet buffers.
With this fixed, the only allocations in NetworkTask happen when
initially allocating the PacketWithTimestamp structs and when switching
contexts.
This avoids allocations for initializing the Function<T>
for the NetworkAdapter::for_each callback argument.
Applying this patch decreases CPU utilization for NetworkTask
from 40% to 28% when receiving TCP packets at a rate of 100Mbit/s.
The last IP address in an IPv4 subnet is considered the directed
broadcast address, e.g. for 192.168.3.0/24 the directed broadcast
address is 192.168.3.255. We need to consider this address as
belonging to the interface.
Here's an example with this fix applied (SerenityOS has 192.168.3.190):
[gunnar@nyx ~]$ ping -b 192.168.3.255
WARNING: pinging broadcast address
PING 192.168.3.255 (192.168.3.255) 56(84) bytes of data.
64 bytes from 192.168.3.175: icmp_seq=1 ttl=64 time=0.950 ms
64 bytes from 192.168.3.188: icmp_seq=1 ttl=64 time=2.33 ms
64 bytes from 192.168.3.46: icmp_seq=1 ttl=64 time=2.77 ms
64 bytes from 192.168.3.41: icmp_seq=1 ttl=64 time=4.15 ms
64 bytes from 192.168.3.190: icmp_seq=1 ttl=64 time=29.4 ms
64 bytes from 192.168.3.42: icmp_seq=1 ttl=64 time=30.8 ms
64 bytes from 192.168.3.55: icmp_seq=1 ttl=64 time=31.0 ms
64 bytes from 192.168.3.30: icmp_seq=1 ttl=64 time=33.2 ms
64 bytes from 192.168.3.31: icmp_seq=1 ttl=64 time=33.2 ms
64 bytes from 192.168.3.173: icmp_seq=1 ttl=64 time=41.7 ms
64 bytes from 192.168.3.43: icmp_seq=1 ttl=64 time=47.7 ms
^C
--- 192.168.3.255 ping statistics ---
1 packets transmitted, 1 received, +10 duplicates, 0% packet loss,
time 0ms, rtt min/avg/max/mdev = 0.950/23.376/47.676/16.539 ms
[gunnar@nyx ~]$
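The directed broadcast address follows directly from the interface
address and netmask. Here is a small sketch of that arithmetic; the
helper names are hypothetical and addresses are kept in host byte order
for readability.

```cpp
#include <cstdint>

// Builds a host-byte-order IPv4 address from its four octets.
constexpr uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t(a) << 24) | (uint32_t(b) << 16) | (uint32_t(c) << 8) | uint32_t(d);
}

// Network part kept, every host bit set to one.
constexpr uint32_t directed_broadcast(uint32_t address, uint32_t netmask)
{
    return (address & netmask) | ~netmask;
}

// A packet belongs to the interface if it targets the interface's own
// address or the directed broadcast address of its subnet.
constexpr bool is_for_interface(uint32_t destination, uint32_t address, uint32_t netmask)
{
    return destination == address || destination == directed_broadcast(address, netmask);
}
```

For the example above, 192.168.3.190 with a /24 netmask yields
192.168.3.255, so the ping to that address is answered.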
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Since the receiving socket isn't yet known at packet receive time,
keep timestamps for all packets.
This is useful for keeping statistics about in-kernel queue latencies
in the future, and it can be used to implement SO_TIMESTAMP.
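As an illustration, a queue that stamps every packet on arrival could look
like the sketch below. The type and function names are hypothetical, and
std::chrono::steady_clock is used only so that latencies are monotonic in
this sketch; delivering SO_TIMESTAMP to userspace would need a wall-clock
timestamp instead.

```cpp
#include <chrono>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

using SketchClock = std::chrono::steady_clock;

// Hypothetical packet record: payload plus the time it was received.
struct TimestampedPacket {
    std::vector<uint8_t> data;
    SketchClock::time_point received_at;
};

class TimestampedQueue {
public:
    // Every packet is stamped, since the receiving socket is not known yet.
    void enqueue(std::vector<uint8_t> data, SketchClock::time_point received_at = SketchClock::now())
    {
        m_packets.push_back(TimestampedPacket { std::move(data), received_at });
    }

    // Returns false if the queue is empty.
    bool dequeue(TimestampedPacket& out)
    {
        if (m_packets.empty())
            return false;
        out = std::move(m_packets.front());
        m_packets.pop_front();
        return true;
    }

private:
    std::deque<TimestampedPacket> m_packets;
};

// Time the packet spent queued before being handed to a socket.
inline SketchClock::duration queue_latency(TimestampedPacket const& packet, SketchClock::time_point now)
{
    return now - packet.received_at;
}
```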
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data, add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least, we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
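The one manual check mentioned above (user pointers must not reach into
kernel mappings) can be sketched as a pure range test. The boundary
constant and function name are assumptions made for this example; a real
copy_from_user would run a check like this and then call safe_memcpy,
relying on the CPU to fault on anything else that is invalid.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical address-space split: addresses at or above this are kernel.
constexpr uintptr_t kSketchKernelBase = 0xc0000000;

// True if [base, base + size) lies entirely below the kernel boundary.
constexpr bool is_user_range(uintptr_t base, size_t size)
{
    // Comparing against the remaining room avoids overflowing base + size.
    return base < kSketchKernelBase && size <= kSketchKernelBase - base;
}
```

Note that the test never computes base + size, so a huge size cannot wrap
around and slip past the check.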
The idea behind WeakPtr<NetworkAdapter> was to support hot-pluggable
network adapters, but on closer thought, that's super impractical so
let's not go down that road.
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
The majority of the time in NetworkTask was being spent in allocating
and deallocating KBuffers for each incoming packet.
We'll now keep up to 100 buffers around and reuse them for new packets
if the next incoming packet fits in an old buffer. This is pretty
naively implemented but definitely cuts down on time spent here.
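A rough sketch of such a buffer cache follows. It uses std::vector as a
stand-in for KBuffer, and the class and method names are hypothetical.
Checking only the most recently returned buffer is a simplification of
this sketch, in the same naive spirit as the description.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

class PacketBufferPool {
public:
    static constexpr size_t max_cached = 100;

    // Hands out a buffer of `size` bytes, reusing a cached one if it fits.
    std::vector<uint8_t> acquire(size_t size)
    {
        if (!m_unused.empty() && m_unused.back().capacity() >= size) {
            auto buffer = std::move(m_unused.back());
            m_unused.pop_back();
            buffer.resize(size); // within capacity, so no reallocation
            ++m_reuse_count;
            return buffer;
        }
        return std::vector<uint8_t>(size); // fall back to a fresh allocation
    }

    // Returns a buffer to the pool; it is simply dropped once the pool is full.
    void release(std::vector<uint8_t> buffer)
    {
        if (m_unused.size() < max_cached)
            m_unused.push_back(std::move(buffer));
    }

    size_t cached() const { return m_unused.size(); }
    size_t reuse_count() const { return m_reuse_count; }

private:
    std::vector<std::vector<uint8_t>> m_unused;
    size_t m_reuse_count { 0 };
};
```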
The MTU defaults to 1500 bytes for all adapters, but LoopbackAdapter
increases it to 65536 on construction.
If an IPv4 packet is larger than the MTU, we'll need to break it into
smaller fragments before transmitting it. This part is a FIXME. :^)
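Since fragmentation itself is left as a FIXME, the following is only a
sketch of the arithmetic it would involve, not anything from the patch.
It assumes a 20-byte IPv4 header without options and relies on the IPv4
rule that fragment offsets are expressed in units of 8 bytes; the helper
names are hypothetical.

```cpp
#include <cstddef>

// IPv4 header size without options.
constexpr size_t kIpv4HeaderSize = 20;

// Largest payload a non-final fragment can carry: it must be a multiple
// of 8 bytes because fragment offsets are counted in 8-byte units.
constexpr size_t max_fragment_payload(size_t mtu)
{
    return ((mtu - kIpv4HeaderSize) / 8) * 8;
}

// Number of IPv4 packets needed to carry `payload_size` bytes over `mtu`.
constexpr size_t fragment_count(size_t payload_size, size_t mtu)
{
    return payload_size + kIpv4HeaderSize <= mtu
        ? 1
        : (payload_size + max_fragment_payload(mtu) - 1) / max_fragment_payload(mtu);
}
```

With the default MTU of 1500, up to 1480 payload bytes fit in one packet,
while the loopback MTU of 65536 avoids fragmentation for far larger ones.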
Made getsockopt() and setsockopt() virtual so we can handle them in the
various Socket subclasses. The subclasses map kinda nicely to "levels".
This will allow us to implement things like "traceroute". However,
I spent some time trying to do that, but then hit a wall when it turned
out that the user-mode networking in QEMU doesn't preserve TTL in the
ICMP packets passing through.
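The "subclasses map to levels" idea can be sketched like this. The enums
and class names are hypothetical and greatly simplified (real option
handling takes a user pointer and length, and returns an errno-style
code); the point is only that each subclass handles its own level and
defers everything else to its base class.

```cpp
#include <cstdint>

// Hypothetical stand-ins for SOL_SOCKET / IPPROTO_IP style levels and options.
enum class OptionLevel { Generic, IP };
enum class OptionName { ReceiveTimeout, TTL };

class SketchSocket {
public:
    virtual ~SketchSocket() = default;

    // The base class handles only generic, socket-level options.
    virtual bool setsockopt(OptionLevel level, OptionName name, int value)
    {
        if (level == OptionLevel::Generic && name == OptionName::ReceiveTimeout) {
            m_receive_timeout_ms = value;
            return true;
        }
        return false; // unknown level or option
    }

    int receive_timeout_ms() const { return m_receive_timeout_ms; }

private:
    int m_receive_timeout_ms { 0 };
};

class SketchIPv4Socket : public SketchSocket {
public:
    // IP-level options are handled here; anything else goes to the base.
    bool setsockopt(OptionLevel level, OptionName name, int value) override
    {
        if (level != OptionLevel::IP)
            return SketchSocket::setsockopt(level, name, value);
        if (name == OptionName::TTL && value >= 0 && value <= 255) {
            m_ttl = static_cast<uint8_t>(value);
            return true;
        }
        return false;
    }

    uint8_t ttl() const { return m_ttl; }

private:
    uint8_t m_ttl { 64 };
};
```

A per-socket TTL like the one above is the piece a traceroute-style tool
would need to set before sending each probe.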