217 commits

Author SHA1 Message Date
Jelle Raaijmakers
f391ccfe53 LibGfx+Everywhere: Change Gfx::Rect to be endpoint exclusive
Previously, calling `.right()` on a `Gfx::Rect` would return the last
column's coordinate still inside the rectangle, or `left + width - 1`.
This is called 'endpoint inclusive' and does not make a lot of sense for
`Gfx::Rect<float>` where a rectangle of width 5 at position (0, 0) would
return 4 as its right side. This same problem exists for `.bottom()`.

This changes `Gfx::Rect` to be endpoint exclusive, which gives us the
nice property that `width = right - left` and `height = bottom - top`.
It enables us to treat `Gfx::Rect<int>` and `Gfx::Rect<float>` exactly
the same.

All users of `Gfx::Rect` have been updated accordingly.
2023-05-23 12:35:42 +02:00
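The difference between the two conventions in the `Gfx::Rect` commit above can be illustrated with a minimal sketch. The `Rect` struct below is a hypothetical stand-in written for this example, not the real `Gfx::Rect`; only the `left + width - 1` and `width = right - left` relations come from the commit message.

```cpp
#include <cassert>

// Hypothetical stand-in for Gfx::Rect, for illustration only.
template<typename T>
struct Rect {
    T x {};
    T y {};
    T width {};
    T height {};

    T left() const { return x; }
    T top() const { return y; }

    // Endpoint exclusive: the first coordinate that lies outside the rectangle.
    T right() const { return x + width; }
    T bottom() const { return y + height; }

    // Old endpoint-inclusive behaviour, kept here only for comparison.
    T inclusive_right() const { return x + width - 1; }
};
```

With a `Rect<float>` of width 5 at (0, 0), `right()` yields 5 and `inclusive_right()` yields 4, so only the exclusive form satisfies `width = right - left` for both integer and floating point rectangles.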
Jelle Raaijmakers
62285e0569 LibSoftGPU: Use multiplication instead of division for linear fog
Sampling profiling shows a reduction of nearly 60% for the linear fog
calculation case.
2023-02-18 01:45:00 +01:00
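A rough sketch of the idea behind the linear fog commit above, assuming the standard `GL_LINEAR` fog factor `(end - z) / (end - start)`: the division depends only on the fog configuration, so its reciprocal can be computed once and multiplied per fragment. The `LinearFog` struct and its member names are hypothetical and not taken from LibSoftGPU.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical fog configuration; names are illustrative, not LibSoftGPU's.
struct LinearFog {
    float end;
    float inverse_range; // 1 / (end - start), computed once per configuration change

    LinearFog(float start, float end_)
        : end(end_)
        , inverse_range(1.0f / (end_ - start))
    {
        assert(end_ != start);
    }

    // Fog factor (end - z) / (end - start), clamped to [0, 1].
    // The per-fragment division is replaced by a multiplication.
    float factor(float z) const
    {
        float const f = (end - z) * inverse_range;
        return std::min(1.0f, std::max(0.0f, f));
    }
};
```

Multiplying by a precomputed reciprocal can differ from a true division in the last bit of the result, which is usually acceptable for a fog factor.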
Jelle Raaijmakers
f54d9c0a61 LibSoftGPU: Use AK::SIMD::exp_approximate instead of ::exp
The approximate version is properly vectorized and results in fewer
stalls than the `::exp` version.
2023-02-18 01:45:00 +01:00
Jelle Raaijmakers
62f4486190 LibSoftGPU: Only enable texture stages if required
Copying over every texel (4x`f32x4`) for every texture unit is
relatively expensive. By checking if we even need to remember these
texel values, we reduce the time spent in `rasterize_triangle` by
around 2% as measured in Quake III.
2023-02-02 14:38:26 +01:00
Jelle Raaijmakers
69b94e4235 LibSoftGPU: Make blending simpler and more efficient
Previously, we would precalculate "alpha blend factors" on every
configuration update and then calculate the source and destination
blending factors in one go using all these factors. The idea here was
probably that we would get better performance by avoiding branching.

However, by measuring blending performance in Quake III, it seems that
this simpler version that only calculates the required factors reduces
the CPU time spent in `rasterize_triangle` by 3%.

As a bonus, `GL_SRC_ALPHA_SATURATE` is now also implemented.
2023-02-02 14:38:26 +01:00
Jelle Raaijmakers
5c1038e54f LibSoftGPU: Remove DeprecatedString usage 2023-01-30 13:49:52 -05:00
Linus Groh
9c08bb9555 AK: Remove try_ prefix from FixedArray creation functions 2023-01-28 22:41:36 +01:00
Tim Schumacher
82a152b696 LibGfx: Remove try_ prefix from bitmap creation functions
Those don't have any non-try counterpart, so we might as well just omit
it.
2023-01-26 20:24:37 +00:00
Jelle Raaijmakers
44d679ba7e LibSoftGPU: Remove workaround for i686 depth comparison 2023-01-09 12:55:41 +01:00
Andrew Kaster
a492e2018d Userland: Silence warnings from ElapsedTimer::elapsed() type change
We changed elapsed() to return i64 instead of int as that's what
AK::Time::to_milliseconds() returns, causing a bunch of implicit lossy
conversions in callers. Clean those up with a mix of type changes and
casts.
2023-01-07 14:51:04 +01:00
Stephan Unverwerth
3b2ded1d44 LibGPU+LibSoftGPU: Move size and pixel format information to GPU::Image
Size and format information are the same for every implementation and do
not need to be virtual. This removes the need to reimplement them for
each driver.
2022-12-26 09:39:20 +01:00
Stephan Unverwerth
c25359df47 LibSoftGPU: Delegate shader creation to new class ShaderCompiler 2022-12-17 22:39:09 -07:00
Stephan Unverwerth
b18bf702ea LibSoftGPU: Implement shader processor for SoftGPU ISA
This adds a shader processor that executes our ISA when a fragment
shader is currently bound to the device.
2022-12-17 22:39:09 -07:00
Stephan Unverwerth
1e548a84d6 LibSoftGPU: Define a simple shader instruction set
This adds a simple instruction set with basic operations and adds an
instruction list to the shader class.
2022-12-17 22:39:09 -07:00
Stephan Unverwerth
bb28492af0 LibSoftGPU: Make output in PixelQuad generic
Same as with inputs, we define outputs as a generic array of floats.
This can later be expanded to accommodate multiple render targets or
vertex attributes in the case of a vertex shader.
2022-12-17 22:39:09 -07:00
Stephan Unverwerth
c008b6ce18 LibSoftGPU: Make input in PixelQuad generic
Previously we would store vertex color and texture coordinates in
separate fields in PixelQuad. To make them accessible from shaders we
need to store them as a completely generic array of floats.
2022-12-17 22:39:09 -07:00
Stephan Unverwerth
49139d5f4e LibSoftGPU: Allow binding a fragment shader 2022-12-17 22:39:09 -07:00
Stephan Unverwerth
93ab2db80f LibGL+LibSoftGPU: Add GPU side shader infrastructure
This adds a shader class to LibSoftGPU and makes use of it when linking
a GLSL program in LibGL. Also adds actual rendering code to the shader
tests.
2022-12-17 22:39:09 -07:00
MacDue
27fae78335 Meta+Userland: Pass Gfx::IntSize by value
Just two ints like Gfx::IntPoint.
2022-12-07 11:48:27 +01:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Linus Groh
d26aabff04 Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Tim Schumacher
ce2f1b845f Everywhere: Mark dependencies of most targets as PRIVATE
Otherwise, we end up propagating those dependencies into targets that
link against that library, which creates unnecessary link-time
dependencies.

Also included are changes to re-add now-missing dependencies to tools
that actually need them.
2022-11-01 14:49:09 +00:00
Jelle Raaijmakers
1c32d93a12 LibSoftGPU: Call floor_int_range only once in sample_2d_lod
We were invoking `frac_int_range` twice to get the `alpha` and `beta`
values to interpolate between 4 texels, but these call into
`floor_int_range` again. Let's not repeat the work.
2022-10-19 22:22:58 +02:00
Jelle Raaijmakers
88ca72aa79 LibSoftGPU: Extract argb32_color value in rasterization
This makes it easier to correlate slow instructions in the disassembly
view of ProfileViewer.
2022-10-19 22:22:58 +02:00
Jelle Raaijmakers
681695a07a LibSoftGPU: Make alpha testing a static function
There is no need to access the Device's members for alpha testing; pass
in the required alpha function and reference value.
2022-10-19 22:22:58 +02:00
Jelle Raaijmakers
4e63ce231f LibSoftGPU: Clean up Sampler imports 2022-10-19 22:22:58 +02:00
Jelle Raaijmakers
1774fde37c LibSoftGPU: Drop texel Z coordinate from Sampler
We only support 2D indexing into textures at the moment, so don't
perform any work trying to support the Z coordinate.
2022-10-19 22:22:58 +02:00
cflip
abc0c44f0b LibGL+LibGPU+LibSoftGPU: Report maximum texture size 2022-10-19 22:07:05 +02:00
Ben Wiederhake
a99cd09891 Libraries: Add missing includes, add namespace qualifiers
This remained undetected for a long time as HeaderCheck is disabled by
default. This commit makes the following file compile again:

    // file: compile_me.cpp
    #include <LibDNS/Question.h>
    // That's it, this was enough to cause a compilation error.

Likewise for most other files touched by this commit.
2022-09-18 13:27:24 -04:00
Tim Schumacher
ef9b543426 LibC: Remove the LibM interface target 2022-09-16 16:09:19 +00:00
Jelle Raaijmakers
8ff7c52cf4 LibSoftGPU: Return a const& texel in Image to prevent copying
On every texel access, some floating point instructions involved in
copying 4 floats popped up. Let `Image::texel() const` return a
`FloatVector4 const&` to prevent these operations.

This results in a ~7% FPS increase in GLQuake on my machine.
2022-09-14 17:17:36 +02:00
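The change in the `Image::texel()` commit above can be sketched as follows. This is a simplified, hypothetical `Image` backed by a `std::vector`; the real LibSoftGPU class and its `FloatVector4` type differ, and only the `const&` return type reflects the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for FloatVector4.
struct FloatVector4 {
    float x, y, z, w;
};

// Simplified, hypothetical image; not the real LibSoftGPU::Image.
class Image {
public:
    Image(std::size_t width, std::size_t height)
        : m_width(width)
        , m_texels(width * height)
    {
    }

    // Returning a const reference hands out the stored texel itself
    // instead of copying four floats on every access.
    FloatVector4 const& texel(std::size_t x, std::size_t y) const
    {
        assert(x < m_width && y * m_width + x < m_texels.size());
        return m_texels[y * m_width + x];
    }

    void set_texel(std::size_t x, std::size_t y, FloatVector4 const& value)
    {
        assert(x < m_width && y * m_width + x < m_texels.size());
        m_texels[y * m_width + x] = value;
    }

private:
    std::size_t m_width;
    std::vector<FloatVector4> m_texels;
};
```

The reference stays valid only as long as the image's storage is not reallocated or destroyed, which is the usual trade-off of this pattern.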
Jelle Raaijmakers
e9d2f9a95e LibSoftGPU: Use memcpy instead of a loop to blit the color buffer
Looking at `Tubes` before and after this change, comparing the original
loop to the one using `memcpy`, including the time for `memcpy` itself,
resulted in ~15% fewer samples in profiles on my machine.
2022-09-14 17:17:19 +02:00
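A minimal sketch of the two blitting strategies compared in the commit above, under the assumption that source and target store 32-bit pixels contiguously in the same layout. The function names are hypothetical and the real code may copy per scanline instead of in one call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Pixel-by-pixel copy, comparable to the original loop.
void blit_with_loop(std::vector<std::uint32_t> const& source, std::vector<std::uint32_t>& target)
{
    assert(target.size() >= source.size());
    for (std::size_t i = 0; i < source.size(); ++i)
        target[i] = source[i];
}

// Single bulk copy; valid because both buffers are contiguous and share a pixel format.
void blit_with_memcpy(std::vector<std::uint32_t> const& source, std::vector<std::uint32_t>& target)
{
    assert(target.size() >= source.size());
    if (source.empty())
        return;
    std::memcpy(target.data(), source.data(), source.size() * sizeof(std::uint32_t));
}
```

Both functions produce the same target contents; any speed difference depends on the compiler and the machine, as the commit's "on my machine" wording suggests.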
Jelle Raaijmakers
6dcc808994 LibSoftGPU: Reduce subpixel precision from 6 to 4 bits
With 6 bits of precision, the maximum triangle coordinate we can
handle is sqrt(2^31 / (1 << 6)^2) = ~724. Rendering to a target of
800x600 or higher quickly becomes a mess because of integer overflow.

By reducing the subpixel precision to 4 bits, we support coordinates up
to ~2896, which means that we can (try to) render to target sizes like
2560x1440.

This fixes the main menu backdrop for the Half-Life port. It also
introduces more white pixel artifacts in Quake's water / lava
rendering, but this is a level geometry visualization bug (see
`r_novis`).
2022-09-13 20:20:03 +02:00
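The coordinate limits quoted in the commit above follow from the formula sqrt(2^31 / (1 << bits)^2). The helper below is a hypothetical function written for this example that evaluates the formula for a given number of subpixel bits.

```cpp
#include <cassert>
#include <cmath>

// Largest coordinate c for which (c * 2^bits)^2 stays below 2^31,
// following the formula in the commit message.
double max_triangle_coordinate(int subpixel_bits)
{
    assert(subpixel_bits >= 0 && subpixel_bits < 16);
    double const subpixel_factor = static_cast<double>(1 << subpixel_bits);
    return std::sqrt(2147483648.0 / (subpixel_factor * subpixel_factor));
}
```

For 6 bits this evaluates to roughly 724 and for 4 bits to roughly 2896, matching the numbers in the commit message: each bit of precision removed doubles the usable coordinate range.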
Jelle Raaijmakers
eda1ffba73 LibGL: Implement GL_TEXTURE_LOD_BIAS for texture objects 2022-09-13 20:20:03 +02:00
Jelle Raaijmakers
1aa1c89afa LibGL+LibGPU+LibSoftGPU: Report texture clamp to edge support 2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
087f565700 LibSoftGPU: Divide texture coordinates by Q
Up until now, we have only dealt with games that pass Q = 1 for their
texture coordinates. PrBoom+, however, relies on proper homogeneous
texture coordinates for its relatively complex sky rendering, which
means that we should perform this per-fragment division.
2022-09-11 22:37:07 +01:00
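The per-fragment division described in the commit above can be sketched as follows. The structs and the `project_texture_coordinates` function are hypothetical names for this example; LibSoftGPU performs the operation on SIMD vectors rather than on single floats.

```cpp
#include <cassert>

// Hypothetical homogeneous texture coordinate (S, T, R, Q).
struct TextureCoordinate {
    float s, t, r, q;
};

// Hypothetical 2D lookup position after the division.
struct TexelPosition {
    float u, v;
};

// Divide S and T by Q; with Q = 1 the coordinates are unchanged,
// which is why games that always pass Q = 1 worked before this change.
TexelPosition project_texture_coordinates(TextureCoordinate const& coordinate)
{
    assert(coordinate.q != 0.0f);
    return { coordinate.s / coordinate.q, coordinate.t / coordinate.q };
}
```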
Jelle Raaijmakers
00d46e5d77 LibGL+LibGPU+LibSoftGPU: Implement matrix stack per texture unit
Each texture unit now has its own texture transformation matrix stack.
Introduce a new texture unit configuration that is synced when changed.
Because we're no longer passing a silly `Vector` when drawing each
primitive, this results in a slightly improved frames per second :^)
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
1540c56e6c LibGL+LibGPU+LibSoftGPU: Implement GL_GENERATE_MIPMAP
We can now generate texture mipmaps on the fly if the client requests
it. This fixes the missing textures in our PrBoom+ port.
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
dda5987684 LibGL+LibGPU+LibSoftGPU: Remove concept of layer in favor of depth
Looking at how Khronos defines layers:

  https://www.khronos.org/opengl/wiki/Array_Texture

We have both 3D textures and layers of 2D textures, which can both be
encoded in our existing `Typed3DBuffer` as depth. Since we support
depth already in the GPU API, remove layer everywhere.

Also pass in `Texture2D::LOG2_MAX_TEXTURE_SIZE` as the maximum number
of mipmap levels, so we do not allocate 999 levels on each Image
instantiation.
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
44953a4301 LibGL+LibGPU+LibSoftGPU: Implement glCopyTex(Sub)?Image2d
These two methods copy from the frame buffer to (part of) a texture.
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
eb81b66b4e LibGL+LibGPU+LibSoftGPU: Rename blit_color_buffer_to
This makes it consistent with our other `blit_from_color_buffer` and
paves the way for a third method that will be introduced in one of the
next commits.
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
1d36bfdac1 LibGL+LibSoftGPU: Implement fixed pipeline support for GL_COMBINE
`GL_COMBINE` is basically a fixed function calculator to perform simple
arithmetic on configurable fragment sources. This patch implements a
number of texture env parameters with support for the RGBA internal
format.
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
759ef82e75 LibSoftGPU: Convert width and height to f32x4 just once
We were passing along a `u32x4` only for it to be converted to `f32x4`
as soon as we'd use it.
2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
b62dba6bbf LibSoftGPU: Remove unused alias truncate_int_range 2022-09-11 22:37:07 +01:00
Jelle Raaijmakers
b42feb76a0 LibSoftGPU: Use approximation for maximum depth slope
OpenGL allows GPUs to approximate a triangle's maximum depth slope,
which avoids a number of computationally expensive instructions. On my
machine, this gives me +6% FPS in Quake III.

We are able to reuse `render_bounds` here since it is the containing
rect of the (X, Y) window coordinates of the triangle, thus its width
and height are the maximum delta X and delta Y, respectively.
2022-09-08 12:07:03 -04:00
Jelle Raaijmakers
94f016b363 LibGL+LibGPU+LibSoftGPU: Report texture env add extension
The Quake 3 port makes use of this extension to determine a more
efficient multitexturing strategy. Since LibSoftGPU supports it, let's
report the extension in LibGL. :^)
2022-08-28 23:45:43 +01:00
Jelle Raaijmakers
84c4b66721 LibGL+LibGPU+LibSoftGPU: Implement texture pixel format support
In OpenGL this is called the (base) internal format which is an
expectation expressed by the client for the minimum supported texel
storage format in the GPU for textures.

Since we store everything as RGBA in a `FloatVector4`, the only thing
we do in this patch is remember the expected internal format, and when
we write new texels we fixate the value for the alpha channel to 1 for
two formats that require it.

`PixelConverter` has learned how to transform pixels during transfer to
support this.
2022-08-27 12:28:05 +02:00
Jelle Raaijmakers
6c80d12111 LibGPU+LibSoftGPU: Add PixelFormat::Intensity 2022-08-27 12:28:05 +02:00
Jelle Raaijmakers
eb7c3d16fb LibGL+LibGPU+LibSoftGPU: Implement flexible pixel format conversion
A GPU (driver) is now responsible for reading and writing pixels from
and to user data. The client (LibGL) is responsible for specifying how
the user data must be interpreted or written to.

This allows us to centralize all pixel format conversion in one class,
`LibSoftGPU::PixelConverter`. For both the input and output image, it
takes a specification containing the image dimensions, the pixel type
and the selection (basically a clipping rect), and converts the pixels
from the input image to the output image.

Effectively this means we now support almost all OpenGL 1.5 formats,
and all custom logic has disappeared from:
  - `glDrawPixels`
  - `glReadPixels`
  - `glTexImage2D`
  - `glTexSubImage2D`

The new logic is still unoptimized, but on my machine I experienced no
noticeable slowdown. :^)
2022-08-27 12:28:05 +02:00