Before doing a check if offset_in_region + num_bytes of the transfer
descriptor are together more than NUM_TRANSFER_REGION_PAGES * PAGE_SIZE,
check that addition of both of these parameters will not simply overflow
which could lead to out-of-bounds read/write.
Fixes#17518.
Dealing with the specific details of how to program a PLL should be done
in a separate file to ensure we can easily expand it to support future
generations of the Intel graphics device.
ENODEV better represents the fact that there might be no display device
(e.g. a monitor) connected to the connector, therefore we should return
this error.
Another reason to not use ENOTIMPL is that it's a requirement for all
DisplayConnectors to put a valid EDID in place even for a hardware we
don't currently support mode-setting in runtime.
Instead of doing that on the IntelDisplayPlane class, let's have this in
derived classes so these classes can decide how to use the settings that
were provided before calling the enable method.
It became apparent to me that future generations of the Intel graphics
chipset utilize the same register set as part of the Transcoder register
set. Therefore, it should be included now in the Transcoder class.
In the real world, graphics hardware tend to have multiple display
connectors. However, usually the connectors share one register space but
still keeping different PLL timings and display lanes.
This new class should represent a group of multiple display connectors
working together in the same Intel graphics adapter. This opens an
opportunity to abstract the interface so we could support future Intel
iGPU generations.
This is also a preparation before the driver can support newer devices
and utilize their capabilities.
The mentioned preparation is applied in a these aspects:
1. The code is splitted into more classes to adjust to future expansion.
2 classes are introduced: IntelDisplayPlane and IntelDisplayTranscoder,
so the IntelDisplayPlane controls the plane registers and second class
controls the pipeline (transcoder, encoder) registers. On gen4 it's not
really useful because there are probably one plane and one encoder to
care about, but in future generations, there are likely to be multiple
transcoders and planes to accommodate multi head support.
2. The set_edid_bytes method in the DisplayConnector class can now be
told to not assume the provided EDID bytes are always invalid. Therefore
it can refrain from printing error messages if this flag parameter is
true. This is useful for supporting real hardware situation when on boot
not all ports are connected to a monitor, which can result in floating
bus condition (essentially all the bytes we read are 0xFF).
3. An IntelNativeDisplayConnector could now be set to flag other types
of connections such as eDP (embedded DisplayPort), Analog output, etc.
This is important because on the Intel gen4 graphics we could assume to
have one analog output connector, but on future generations this is very
likely to not be the case, as there might be no VGA outputs, but rather
only an eDP connector which is converted to VGA by a design choice of
the motherboard manufacturer.
4. Add ConnectorIndex to IntelNativeDisplayConnector class - Currently
this is used to verify we always handle the correct connector when doing
modesetting.
Later, it will be used to locate special settings needed when handling
connector requests.
5. Prepare to support more types of display planes. For example, the
Intel Skylake register set for display planes is a bit different, so
let's ensure we can properly support it in the near future.
Returning literal strings is not the proper action here, because we
should always assume that error could be propagated back to userland, so
we need to keep a valid errno when returning an Error.
Splitting the I2C-related code lets the DisplayConnector code to utilize
I2C operations without caring about the specific details of the hardware
and allow future expansion of the driver to other newer generations
sharing the same GMBus code.
We should require a timeout for GMBus operations always, because faulty
hardware could let us just spin forever. Also, if nothing is listening
to the bus (which should result in a NAK), we could also spin forever.
Thanks to Andrew Kaster, which gave a review back in October, about a
big PR I opened (#15502), I managed to figure out why we always had a
problem with the first byte being read into the EDID buffer with the
GMBus code. It turns out that this simple invalid cast was making the
entire problem and using the correct AK::Array::data() method fixed this
notorious long standing problem for good.
There are now 2 separate classes for almost the same object type:
- EnumerableDeviceIdentifier, which is used in the enumeration code for
all PCI host controller classes. This is allowed to be moved and
copied, as it doesn't support ref-counting.
- DeviceIdentifier, which inherits from EnumerableDeviceIdentifier. This
class uses ref-counting, and is not allowed to be copied. It has a
spinlock member in its structure to allow safely executing complicated
IO sequences on a PCI device and its space configuration.
There's a static method that allows a quick conversion from
EnumerableDeviceIdentifier to DeviceIdentifier while creating a
NonnullRefPtr out of it.
The reason for doing this is for the sake of integrity and reliablity of
the system in 2 places:
- Ensure that "complicated" tasks that rely on manipulating PCI device
registers are done in a safe manner. For example, determining a PCI
BAR space size requires multiple read and writes to the same register,
and if another CPU tries to do something else with our selected
register, then the result will be a catastrophe.
- Allow the PCI API to have a united form around a shared object which
actually holds much more data than the PCI::Address structure. This is
fundamental if we want to do certain types of optimizations, and be
able to support more features of the PCI bus in the foreseeable
future.
This patch already has several implications:
- All PCI::Device(s) hold a reference to a DeviceIdentifier structure
being given originally from the PCI::Access singleton. This means that
all instances of DeviceIdentifier structures are located in one place,
and all references are pointing to that location. This ensures that
locking the operation spinlock will take effect in all the appropriate
places.
- We no longer support adding PCI host controllers and then immediately
allow for enumerating it with a lambda function. It was found that
this method is extremely broken and too much complicated to work
reliably with the new paradigm being introduced in this patch. This
means that for Volume Management Devices (Intel VMD devices), we
simply first enumerate the PCI bus for such devices in the storage
code, and if we find a device, we attach it in the PCI::Access method
which will scan for devices behind that bridge and will add new
DeviceIdentifier(s) objects to its internal Vector. Afterwards, we
just continue as usual with scanning for actual storage controllers,
so we will find a corresponding NVMe controllers if there were any
behind that VMD bridge.
This header has always been fundamentally a Kernel API file. Move it
where it belongs. Include it directly in Kernel files, and make
Userland applications include it via sys/ioctl.h rather than directly.
Instead of using a clunky switch-case paradigm, we now have all drivers
being declaring two methods for their adapter class - create and probe.
These methods are linked in each PCIGraphicsDriverInitializer structure,
in a new s_initializers static list of them.
Then, when we probe for a PCI device, we use each probe method and if
there's a match, then the corresponding create method is called.
As a result of this change, it's much more easy to add more drivers and
the initialization code is more readable.
We try our best to ensure a DisplayConnector initialization succeeds,
and this makes the Intel driver to work again, because if we can't
allocate a Region for the whole PCI BAR mapped region, then we will try
to allocate a Region with 16 MiB window size, so it doesn't eat the
entire Kernel-allocated virtual memory space.
Instead of just returning nothing, let's return Error or nothing.
This would help later on with error propagation in case of failure
during this method.
This also makes us more paranoid about failure in this method, so when
initializing a DisplayConnector we safely tear down the internal members
of the object. This applies the same for a StorageDevice object, but its
after_inserting method is much smaller compared to the DisplayConnector
overriden method.
A virtual method named device_name() was added to
Kernel::PCI to support logging the PCI::Device name
and address using dmesgln_pci. Previously, PCI::Device
did not store the device name.
All devices inheriting from PCI::Device now use dmesgln_pci where
they previously used dmesgln.
These instances were detected by searching for files that include
Kernel/Debug.h, but don't match the regex:
\\bdbgln_if\(|_DEBUG\\b
This regex is pessimistic, so there might be more files that don't check
for any real *_DEBUG macro. There seem to be no corner cases anyway.
In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
obtain any lock rank just via the template parameter. It was not
previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
wanted in the first place (or made use of). Locks randomly changing
their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
taken in the right order (with the right lock rank checking
implementation) as rank information is fully statically known.
This refactoring even more exposes the fact that Mutex has no lock rank
capabilites, which is not fixed here.
This has been done in multiple ways:
- Each time we modeset the resolution via the VirtIOGPU DisplayConnector
we ensure that the framebuffer is updated with the new resolution.
- Each time the cursor is updated we ensure that the framebuffer console
is marked dirty so the IO Work Queue task which is scheduled to check
if it is dirty, will flush the surface.
- We only initialize a framebuffer console after we ensure that at the
very least a DisplayConnector has being set with a known resolution.
- We only call GenericFramebufferConsole::enable() when enabling the
console after the important variables of the console (m_width, m_pitch
and m_height) have been set.
This is necessary to allow transferring frame buffers larger than
~500x500 pixels back to user space. Until the buffer management is
improved this allows us to at least test the existing game ports.
This happens to be a sad truth for the VirtIOGPU driver - it lacked any
error propagation measures and generally relied on clunky assumptions
that most operations with the GPU device are infallible, although in
reality much of them could fail, so we do need to handle errors.
To fix this, synchronous GPU commands no longer rely on the wait queue
mechanism anymore, so instead we introduce a timeout-based mechanism,
similar to how other Kernel drivers use a polling based mechanism with
the assumption that hardware could get stuck in an error state and we
could abort gracefully.
Then, we change most of the VirtIOGraphicsAdapter methods to propagate
errors properly to the original callers, to ensure that if a synchronous
GPU command failed, either the Kernel or userspace could do something
meaningful about this situation.
The performance that we achieve from this technique is visually worse
compared to turning off this feature, so let's not use this until we
figure out why it happens.
As is, we never *deallocate* them, so we will run out eventually.
Creating a context, or allocating a context ID, now returns ErrorOr if
there are no available free context IDs.
`number_of_fixmes--;` :^)
The BootFramebufferConsole class maps the framebuffer using the
MemoryManager, so to be able to draw the logo, we need to get this
mapped framebuffer. This commit adds a unsafe API for that.
The MemoryManager now works, so we can use the same code as on x86 to
map the framebuffer. Since it uses the MemoryManager, the initialization
of the BootFramebufferConsole has to happen after the MemoryManager is
working.
This value will be used later on by WindowServer to reject resolutions
that will request a mapping that will overflow the hardware framebuffer
max length.
This class is intended to replace all IOAddress usages in the Kernel
codebase altogether. The idea is to ensure IO can be done in
arch-specific manner that is determined mostly in compile-time, but to
still be able to use most of the Kernel code in non-x86 builds. Specific
devices that rely on x86-specific IO instructions are already placed in
the Arch/x86 directory and are omitted for non-x86 builds.
The reason this works so well is the fact that x86 IO space acts in a
similar fashion to the traditional memory space being available in most
CPU architectures - the x86 IO space is essentially just an array of
bytes like the physical memory address space, but requires x86 IO
instructions to load and store data. Therefore, many devices allow host
software to interact with the hardware registers in both ways, with a
noticeable trend even in the modern x86 hardware to move away from the
old x86 IO space to exclusively using memory-mapped IO.
Therefore, the IOWindow class encapsulates both methods for x86 builds.
The idea is to allow PCI devices to be used in either way in x86 builds,
so when trying to map an IOWindow on a PCI BAR, the Kernel will try to
find the proper method being declared with the PCI BAR flags.
For old PCI hardware on non-x86 builds this might turn into a problem as
we can't use port mapped IO, so the Kernel will gracefully fail with
ENOTSUP error code if that's the case, as there's really nothing we can
do within such case.
For general IO, the read{8,16,32} and write{8,16,32} methods are
available as a convenient API for other places in the Kernel. There are
simply no direct 64-bit IO API methods yet, as it's not needed right now
and is not considered to be Arch-agnostic too - the x86 IO space doesn't
support generating 64 bit cycle on IO bus and instead requires two 2
32-bit accesses. If for whatever reason it appears to be necessary to do
IO in such manner, it could probably be added with some neat tricks to
do so. It is recommended to use Memory::TypedMapping struct if direct 64
bit IO is actually needed.
The new VGAIOArbiter class is now responsible to conduct x86-specific
instructions to control VGA hardware from the old ISA ports. This allows
us to ensure the GraphicsManagement code doesn't use x86-specific code,
thus allowing it to be compiled within non-x86 kernel builds.
The BootFramebufferConsole highly depends on using the m_lock spinlock,
therefore setting and changing the cursor state should be done under
that spinlock too to avoid crashing.
Only the Console code in the Graphics directory should be able to call
on these methods. The set_cursor method stays public as VirtualConsole
uses that method to change the cursor position.
This device is supposed to be used in microvm and ISA-PC machine types,
and we assume that if we are able to probe for the QEMU BGA version of
0xB0C5, then we have an existing ISA Bochs VGA adapter to utilize.
To ensure we don't instantiate the driver for non isa-vga devices, we
try to ensure that PCI is disabled because hardware IO test probe failed
so we can be sure that we use this special handling code only in the
QEMU microvm and ISA-PC machine types. Unfortunately, this means that if
for some reason the isa-vga device is attached for the i440FX or Q35
machine types, we simply are not able to drive the device in such setups
at all.
To determine the amount of VRAM being available, we read VBE register at
offset 0xA. That register holds the amount of VRAM divided by 64K, so we
need to multiply the value in our code to use the actual VRAM size value
again.
The isa-vga device requires us to hardcode the framebuffer physical
address to 0xE0000000, and that address is not expected to change in the
future as many other projects rely on the isa-vga framebuffer to be
present at that physical memory address.
We use a ScopeGuard to ensure we always set a console of some sort if we
exit early from the initialization sequence in the GraphicsManagement
code. We do so to ensure we can boot into text mode console in an ISA-PC
machine type, because earlier we failed with an assertion due to not
setting any console for VirtualConsole to use.
The AHCI code doesn't rely on x86 IO at all as it only uses memory
mapped IO so we can simply remove the header.
We also simply don't use x86 IO in the Intel graphics driver, so we can
simply remove the include of the x86 IO header there too.
Everything else was a bunch of stale includes to the x86 IO header and
are actually not necessary, so let's remove them to make it easier to
compile non-x86 Kernel builds.