41 commits

Author SHA1 Message Date
Shannon Booth
b738929195 LibDiff: Fix wrong index used when prepending context lines
`i` is used as the index for 'old lines' in diff generation, not 'new
lines'. Using the wrong index would mean that for certain diffs the
prefixed context information would have wrong content, and could even
result in a crash.

Fix this, and add a test for an input which was previously crashing.
2023-09-11 12:10:50 +01:00
Shannon Booth
dd373eacbc LibDiff+patch: Support multiple patches in a single patch file
Multiple patches may be concatenated in the same patch file, such as git
commits which are changing multiple files at the same time. To handle
this, parse each patch in order in the patch file, and apply each patch
sequentially.

To determine whether we are at the end of a patch (and not just parsing
another hunk) the parser will look for a leading '@@ ' after every hunk.
If that is found, there is another hunk. Otherwise, we must be at the
end of this patch.
2023-07-30 07:47:22 +01:00
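The end-of-patch check described in the commit message above can be sketched roughly as follows. This is a minimal illustration in standard C++ rather than AK types, and the function name `next_line_starts_hunk` is hypothetical, not taken from LibDiff.

```cpp
#include <string_view>

// Hypothetical helper (not LibDiff's API): once a hunk has been consumed,
// a line beginning with "@@ " means another hunk of the same patch follows.
// Anything else, including end of input, is treated as the end of the patch.
bool next_line_starts_hunk(std::string_view remaining_input)
{
    // substr() clamps to the available length, so short input is safe.
    return remaining_input.substr(0, 3) == "@@ ";
}
```

A caller would presumably loop over hunks while this returns true, then go back to parsing the next patch header.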
Shannon Booth
81df0278b1 patch+LibDiff: Implement 'strip' of filenames when parsing patch
Implement the patch '-p' / '--strip' option, which strips the given
number of leading components from filenames parsed in the patch header.
If this option is not given, the filename defaults to the basename of that path.
2023-07-29 17:09:09 -06:00
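As a rough sketch of the stripping behaviour described above, assuming '/' as the only separator: the name `strip_path` and its signature are hypothetical, and the handling of a strip count larger than the number of components is a choice made for this example only.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not LibDiff's actual API.
std::string strip_path(std::string_view path, std::optional<std::size_t> strip_count)
{
    if (!strip_count.has_value()) {
        // No count given: keep only the final component (the basename).
        auto last_slash = path.rfind('/');
        if (last_slash == std::string_view::npos)
            return std::string(path);
        return std::string(path.substr(last_slash + 1));
    }

    // Drop one leading component per requested strip level.
    for (std::size_t i = 0; i < *strip_count; ++i) {
        auto slash = path.find('/');
        if (slash == std::string_view::npos)
            break; // Assumption: with too few components, keep what is left.
        path.remove_prefix(slash + 1);
    }
    return std::string(path);
}
```

With this sketch, `strip_path("a/b/c.txt", 1)` yields `b/c.txt`, while passing `std::nullopt` yields `c.txt`.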
Shannon Booth
2b46e6f664 Everywhere: Update copyrights with my new serenityos.org e-mail :^) 2023-07-15 16:21:29 +02:00
Shannon Booth
828d791a4f LibDiff: Add Diff::apply_patch
Given a set of lines from the file we are patching, and a patch itself,
this function will try to locate where in the file to apply that patch,
and write the result of patching that file (if successful) to the output
stream.
2023-07-13 10:29:30 +01:00
Shannon Booth
efb26b1781 LibDiff: Implement ability to parse a patch header
This is a somewhat naive implementation, but it is enough to parse a
simple unified patch header.

After parsing the patch header, the parser will be at the beginning of
the first hunk's range, ready for that hunk to be parsed.
2023-07-13 10:29:30 +01:00
Shannon Booth
ef45221c21 LibDiff: Make parsing of unified hunks more robust
Parsing of the unified hunks now verifies that the expected number of
lines given by the unified location at the beginning of that hunk are
actually in that hunk. Furthermore, we no longer crash when given a
bogus unified range.

As a further benefit, this begins the scaffolding for a patch parser
which should assist us in parsing full patches - even when we are not
aware of the format that a patch has been written in.
2023-07-13 10:29:30 +01:00
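The line-count verification mentioned above might look something like this. It is a simplified sketch: `HunkRange` and `hunk_body_matches_range` are hypothetical names, and special lines such as "\ No newline at end of file" are deliberately not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type: the line counts announced by "@@ -a,b +c,d @@".
struct HunkRange {
    std::size_t old_length { 0 };
    std::size_t new_length { 0 };
};

// Returns true only if the body has exactly the announced number of lines on
// each side. ' ' counts for both sides, '-' for the old side, '+' for the new.
bool hunk_body_matches_range(HunkRange const& range, std::vector<std::string> const& body)
{
    std::size_t old_seen = 0;
    std::size_t new_seen = 0;
    for (auto const& line : body) {
        if (line.empty())
            return false; // Simplification: every line must carry a prefix.
        switch (line[0]) {
        case ' ':
            ++old_seen;
            ++new_seen;
            break;
        case '-':
            ++old_seen;
            break;
        case '+':
            ++new_seen;
            break;
        default:
            return false; // Unknown prefix: reject instead of crashing.
        }
    }
    return old_seen == range.old_length && new_seen == range.new_length;
}
```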
Shannon Booth
696c92882a LibDiff: Add a forwarding header 2023-07-06 13:22:37 +02:00
Shannon Booth
44f141dd24 LibDiff: Add Diff::write_context_header
This is used to write a context patch header.
2023-07-03 10:41:30 +02:00
Shannon Booth
f02cf2704c LibDiff: Add support for writing formatted context hunks
There is a little bit more complexity involved here than the other
formats. In particular, this is due to the need to determine whether
an addition line or removal line is just that, or a 'change'.
2023-07-03 10:41:30 +02:00
Shannon Booth
55a3dfec10 LibDiff: Add support for generating diffs with surrounding context
While not used in normal diffs due to limitations in the format, this
may be used in context and unified format diffs.
2023-07-02 11:18:11 -06:00
Shannon Booth
f528aedc85 LibDiff: Add Diff::write_unified_header
This is used to write a unified patch header.
2023-07-02 11:18:11 -06:00
Shannon Booth
a4e50deeea LibDiff: Add Diff::write_unified for formatting unified hunks 2023-07-02 11:18:11 -06:00
Shannon Booth
f690807c5a LibDiff: Change underlying representation of Hunk to allow context
The existing hunk data structure does not contain any way to easily
store information about context surrounding the additions and removals
in a hunk. While this does work fine for normal diffs (where there is
never any surrounding context) this data structure is quite limiting for
other use cases.

Without support for surrounding context it is not possible to:
 * Add support for unified or context format to the diff utility to
   output surrounding context.
 * Be able to implement a patch utility that uses the surrounding
   context to reliably locate where to apply a patch when a hunk range
   does not apply perfectly.

This patch changes Diff::Hunk such that its data structure more closely
resembles a unified diff. Each line in a hunk is now either a change,
removal, addition or context.

Allowing hunks to have context inside of them exposes that HackStudio
heavily relies on there being no context in the hunks that it uses for
its git gutter implementation. The fix here is simple - ask git to
produce us a diff that has no context in it!
2023-07-02 11:18:11 -06:00
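To illustrate the kind of representation this commit describes, here is a minimal sketch in standard C++. The type and member names are illustrative guesses based on the message above, not copies of LibDiff's definitions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types loosely modelled on the description above.
struct Line {
    enum class Operation {
        Context,
        Addition,
        Removal,
        Change,
    };
    Operation operation;
    std::string content;
};

struct Hunk {
    std::size_t old_start { 0 };
    std::size_t new_start { 0 };
    std::vector<Line> lines; // Every line carries its own operation.
};

// A consumer that assumes context-free hunks (as the message says HackStudio
// does) could check that assumption like this.
bool has_context(Hunk const& hunk)
{
    for (auto const& line : hunk.lines) {
        if (line.operation == Line::Operation::Context)
            return true;
    }
    return false;
}
```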
Shannon Booth
d3a27eeaf2 LibDiff: Port Diff::Hunk from DeprecatedString to String
Removing the last instance of DeprecatedString in this library! :^)
2023-06-26 19:26:34 +02:00
Shannon Booth
9dc92f19b8 LibDiff: Port Diff::parse_hunks from DeprecatedString to StringView 2023-06-26 19:26:34 +02:00
Shannon Booth
ee92378b80 LibDiff: Make Diff::from_text fallible 2023-06-26 19:26:34 +02:00
Shannon Booth
23df5748f6 LibDiff: Make Diff::parse_hunks fallible
Currently the only error that can happen is an OOM. However, in the
future there may be other errors that this function may throw, such as
detecting an invalid patch.
2023-06-26 19:26:34 +02:00
Shannon Booth
dbd838efdf LibDiff: Replace DeprecatedString with StringView in parse_hunk_location 2023-06-26 19:26:34 +02:00
Shannon Booth
7afff80e71 LibDiff: Add Diff::write_normal for outputting normal hunks
This extracts code that was duplicated between the browser and the diff
utility.
2023-06-24 18:34:08 +02:00
Karol Kosek
697d6ffb1d LibDiff: Make Diff::generate_only_additions take text as StringView 2022-12-20 10:58:54 +01:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Tim Schumacher
7834e26ddb Everywhere: Explicitly link all binaries against the LibC target
Even though the toolchain implicitly links against -lc, it does not know
where it should get LibC from except for the sysroot. In the case of
Clang this causes it to pick up the LibC stub instead, which might be
slightly outdated and feature missing symbols.

This is currently not an issue that manifests because we pass through
the dependency on LibC and other libraries by accident, which causes
CMake to link against the LibC target (instead of just the library),
and thus points the linker at the build output directory.

Since we are looking to fix that in the upcoming commits, let's make
sure that everything will still be able to find the proper LibC first.
2022-11-01 14:49:09 +00:00
demostanis
3e8b5ac920 AK+Everywhere: Turn bool keep_empty to an enum in split* functions 2022-10-24 23:29:18 +01:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Daniel Bertalan
4b4177f39c LibDiff: Generate hunks for new/deleted files
Previously we would fail to generate any hunks if the old or the new
file was empty. We now do, with the original/target line index set to 0,
as specified by POSIX.
2022-03-08 23:30:47 +01:00
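For reference, a small sketch of what line index 0 looks like in practice, assuming the POSIX normal diff format (`0a1,N` for a file created from nothing, `1,Nd0` for a file emptied completely). The function names are hypothetical and not part of LibDiff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the normal-format command for adding `line_count`
// lines to an empty old file. The old side is "after line 0".
std::string normal_command_for_new_file(std::size_t line_count)
{
    assert(line_count > 0);
    if (line_count == 1)
        return "0a1";
    return "0a1," + std::to_string(line_count);
}

// Hypothetical helper: the command for deleting all `line_count` lines, which
// leaves an empty new file. The new side is "after line 0".
std::string normal_command_for_deleted_file(std::size_t line_count)
{
    assert(line_count > 0);
    if (line_count == 1)
        return "1d0";
    return "1," + std::to_string(line_count) + "d0";
}
```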
Itamar
4737fc2485 LibDiff: Flush leftover hunks at the end
This change makes sure that we arrive at the end of the "DP-matrix" and
flush any leftover hunks to get a complete diff result.
2022-02-09 00:51:31 +01:00
Conor Byrne
faad7a3ed1 LibDiff: Fix error when parsing a 'new' hunk location
If the location started at 0, and / or the length was 0, it would
originally turn out to be a location of { -1, -1 } when LibDiff was
finished parsing, which was incorrect.

To fix this, we only subtract 1 if `start` or `length` isn't 0.
2021-12-31 14:12:54 +01:00
Andreas Kling
8b1108e485 Everywhere: Pass AK::StringView by value 2021-11-11 01:27:46 +01:00
Andreas Kling
5f7d008791 AK+Everywhere: Stop including Vector.h from StringView.h
Preparation for using Error.h from Vector.h. This required moving some
things out of line.
2021-11-10 21:58:58 +01:00
Mustafa Quraish
6f423ed26e LibDiff: Coalesce adjacent changes into the same Hunk
Now we keep track of the "current" hunk, and only create a new one
if there's at least a single unmodified line between changes.
2021-09-24 14:32:52 +02:00
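The coalescing rule above can be sketched as follows. This is only an illustration of the idea: it works on a per-line "changed" flag instead of real hunks, and `coalesce_changes` is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Groups runs of changed lines into [first, last] index ranges. A new range
// is started only after at least one unmodified line has been seen.
std::vector<std::pair<std::size_t, std::size_t>> coalesce_changes(std::vector<bool> const& line_changed)
{
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    bool in_range = false; // Tracks whether a "current" range is open.
    for (std::size_t i = 0; i < line_changed.size(); ++i) {
        if (!line_changed[i]) {
            in_range = false;
            continue;
        }
        if (in_range) {
            ranges.back().second = i; // Extend the current range.
        } else {
            ranges.emplace_back(i, i); // Open a new range.
            in_range = true;
        }
    }
    return ranges;
}
```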
Mustafa Quraish
35704ba272 LibDiff: Perform diffing-algorithm in reverse order
Previously the algorithm was being performed from the start of the
string to the end, which was a little more convenient when writing
the code, but made it more annoying to be able to properly talk
about the "start" of where the changes were happening, since we
can only re-construct the changes in reverse order of the initial
traversal.

Basically, doing the initial pass in reverse lets us reconstruct
the hunks in the correct order to begin with, and not have to worry
about reversing the hunks / lines within the hunks.
2021-09-24 14:32:52 +02:00
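One way to picture the reverse-order idea: fill the table from the end of both inputs, so that the reconstruction walks forward from the first line and produces operations in their natural order. The sketch below is an assumption-laden illustration in standard C++ (`diff_ops` is a hypothetical name), not the code from this commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns one character per step: ' ' for a kept line, '-' for a line removed
// from the old text, '+' for a line added from the new text.
std::string diff_ops(std::vector<std::string> const& old_lines, std::vector<std::string> const& new_lines)
{
    std::size_t const n = old_lines.size();
    std::size_t const m = new_lines.size();

    // table[i][j] = LCS length of old_lines[i..] and new_lines[j..].
    // Filling starts at the end of both inputs.
    std::vector<std::vector<std::size_t>> table(n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = m; j-- > 0;) {
            table[i][j] = old_lines[i] == new_lines[j]
                ? table[i + 1][j + 1] + 1
                : std::max(table[i + 1][j], table[i][j + 1]);
        }
    }

    // Reconstruction walks forward, so no reversal is needed afterwards.
    std::string ops;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < n && j < m) {
        if (old_lines[i] == new_lines[j]) {
            ops += ' ';
            ++i;
            ++j;
        } else if (table[i + 1][j] >= table[i][j + 1]) {
            ops += '-';
            ++i;
        } else {
            ops += '+';
            ++j;
        }
    }
    ops.append(n - i, '-'); // Leftover old lines are removals.
    ops.append(m - j, '+'); // Leftover new lines are additions.
    return ops;
}
```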
Mustafa Quraish
5e28da1aa4 LibDiff: Add new API to generate hunks from two pieces of text
For now this is just a standard implementation of the longest
common subsequence algorithm over the lines, except that it doesn't
do any coalescing of the lines. This isn't really ideal since
we get a single Hunk per changed line, and is definitely something
to improve in the future.
2021-09-17 16:56:59 +00:00
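For readers unfamiliar with it, the longest common subsequence computation over lines can be sketched like this. It is the textbook dynamic-programming form in standard C++ with a hypothetical function name, and it only computes the length; it is not the implementation added by this commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns the number of lines in the longest common subsequence of the two
// inputs. Lines outside that subsequence are the ones a diff reports.
std::size_t lcs_length(std::vector<std::string> const& old_lines, std::vector<std::string> const& new_lines)
{
    std::size_t const n = old_lines.size();
    std::size_t const m = new_lines.size();

    // table[i][j] = LCS length of the first i old lines and first j new lines.
    std::vector<std::vector<std::size_t>> table(n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            if (old_lines[i - 1] == new_lines[j - 1])
                table[i][j] = table[i - 1][j - 1] + 1;
            else
                table[i][j] = std::max(table[i - 1][j], table[i][j - 1]);
        }
    }
    return table[n][m];
}
```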
Andreas Kling
de395a3df2 AK+Everywhere: Consolidate String::index_of() and String::find()
We had two functions for doing mostly the same thing. Combine both
of them into String::find() and use that everywhere.

Also add some tests to cover basic behavior.
2021-05-24 11:59:18 +02:00
Andreas Kling
5b87276841 LibDiff: Convert StringBuilder::appendf() => AK::Format 2021-05-07 21:12:09 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
asynts
8465683dcf Everywhere: Debug macros instead of constexpr.
This was done with the following script:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/dbgln<debug_([a-z_]+)>/dbgln<\U\1_DEBUG>/' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/if constexpr \(debug_([a-z0-9_]+)/if constexpr \(\U\1_DEBUG/' {} \;
2021-01-25 09:47:36 +01:00
asynts
9229ba0fe9 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-22 22:14:30 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00