Commit graph

138 commits

Author SHA1 Message Date
Itamar
f91974677c LibCpp: Use lex_iterable() where applicable 2021-08-21 22:09:56 +02:00
Itamar
606e05852f LibCpp: Add lex_iterable() method to the Lexer
This allows us to collect the tokens iteratively instead of having to
lex the whole program and then get a tokens vector.
2021-08-21 22:09:56 +02:00
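The difference between collecting a tokens vector and lexing iteratively can be sketched as follows. This is a minimal illustration, not the LibCpp code: `ToyLexer`, its whitespace-separated "tokens", and the callback signature are all assumptions made for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical toy lexer: a "token" is just a run of non-whitespace characters.
class ToyLexer {
public:
    explicit ToyLexer(std::string_view input)
        : m_input(input)
    {
    }

    // Calls `callback` once per token, as soon as the token is recognized.
    void lex_iterable(const std::function<void(std::string)>& callback) const
    {
        std::size_t i = 0;
        while (i < m_input.size()) {
            // Skip whitespace between tokens.
            while (i < m_input.size() && std::isspace(static_cast<unsigned char>(m_input[i])))
                ++i;
            std::size_t start = i;
            while (i < m_input.size() && !std::isspace(static_cast<unsigned char>(m_input[i])))
                ++i;
            if (i > start)
                callback(std::string(m_input.substr(start, i - start)));
        }
    }

    // The collect-everything variant can be built on top of the iterative one.
    std::vector<std::string> lex() const
    {
        std::vector<std::string> tokens;
        lex_iterable([&](std::string token) { tokens.push_back(std::move(token)); });
        return tokens;
    }

private:
    std::string_view m_input;
};
```

A caller that only needs to look at each token once can pass a callback to `lex_iterable()` and avoid building the intermediate vector.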
Itamar
7a4a32b112 LibCpp: Lex before processing the source in the Preprocessor
Previously, the preprocessor first split the source into lines, and then
processed and lexed each line separately.

This patch makes the preprocessor first lex the source, and then do the
processing on the tokenized representation.

This generally simplifies the code, and also fixes an issue we
previously had with multiline comments (we did not recognize them
correctly when processing each line separately).
2021-08-21 22:09:56 +02:00
Itamar
165a0082c4 LibCpp: Allow whitespace between # and preprocessor directive
For example, '#    include <stdio.h>' is now supported by the Lexer.
2021-08-21 22:09:56 +02:00
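A minimal sketch of recognizing a directive with whitespace after the '#', assuming a hypothetical helper `directive_name()` that works on a single line (the real Lexer works on the whole source and is not shown here):

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: returns the directive name (e.g. "include") if `line`
// starts a preprocessor directive, allowing blanks before and after '#'.
std::optional<std::string> directive_name(std::string_view line)
{
    std::size_t i = 0;
    auto skip_blanks = [&] {
        while (i < line.size() && (line[i] == ' ' || line[i] == '\t'))
            ++i;
    };

    skip_blanks();
    if (i >= line.size() || line[i] != '#')
        return std::nullopt;
    ++i;

    // This is the step the commit describes: blanks between '#' and the name.
    skip_blanks();

    std::size_t start = i;
    while (i < line.size() && std::isalpha(static_cast<unsigned char>(line[i])))
        ++i;
    if (i == start)
        return std::nullopt;
    return std::string(line.substr(start, i - start));
}
```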
Itamar
e57fdb63f8 Tests: Add regression tests for the LibCpp preprocessor
Similarly to the LibCpp parser regression tests, these tests run the
preprocessor on the .cpp test files under
Userland/LibCpp/Tests/preprocessor, and compare the output with existing
.txt ground truth files.
2021-08-14 12:40:55 +02:00
Itamar
a38c330c68 LibCpp: Move parser tests to Userland/Libraries/LibCpp/Tests/parser 2021-08-14 12:40:55 +02:00
Itamar
f6c9071f0d LibCpp: Evaluate function-like macro calls 2021-08-14 12:40:55 +02:00
Itamar
8505fcb8ae LibCpp: Understand preprocessor macro definition and invocation
The preprocessor now understands when a function-like macro is defined,
and can also parse calls to such macros.

The actual evaluation of function-like macros will be done in a
separate commit.
2021-08-14 12:40:55 +02:00
Itamar
c7d3a7789c LibCpp: Add lexer option to ignore whitespace tokens 2021-08-14 12:40:55 +02:00
Itamar
9da9398bf0 LibCpp: Do macro substitution in the preprocessor instead of the parser
After this change, the parser is completely separated from preprocessor
concepts.
2021-08-07 21:24:11 +02:00
Itamar
0c4dc00f01 LibCpp: Import definitions from headers while processing
When the preprocessor encounters an #include statement it now adds
the preprocessor definitions that exist in the included header to its
own set of definitions.

We previously only aggregated the definitions from headers after
processing the source, which was less correct. (For example, there
could be an #ifdef that depends on a definition from another header).
2021-08-07 21:24:11 +02:00
Itamar
4673a517f6 LibCpp: Do lexing in the Preprocessor
We now call Preprocessor::process_and_lex() and pass the result to the
parser.

Doing the lexing in the preprocessor will allow us to maintain the
original position information of tokens after substituting definitions.
2021-08-07 21:24:11 +02:00
Itamar
bf7262681e LibCpp: Support initializing the lexer with a "start line" 2021-08-07 21:24:11 +02:00
Ali Mohammad Pur
f16011e4d1 LibCpp: Allow 'final' in a class declaration with inheritance 2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
010be01694 LibCpp: Add support for east const
Now LibCpp can understand the eastest of consts too :^)
2021-08-02 01:03:59 +02:00
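For readers unfamiliar with the term, "east const" places the qualifier after the type it applies to. The aliases below are hypothetical names chosen for the example; the static assertions only show that both spellings name the same type in standard C++.

```cpp
#include <type_traits>

// "West const": the qualifier is written before the type.
using WestConstInt = const int;
using WestPointer = const char*;

// "East const": the qualifier is written after the type it applies to.
using EastConstInt = int const;
using EastPointer = char const*;

static_assert(std::is_same<WestConstInt, EastConstInt>::value, "same type, different spelling");
static_assert(std::is_same<WestPointer, EastPointer>::value, "same type, different spelling");

// A const pointer to mutable char is a different type from a pointer to const char.
using ConstPointer = char* const;
static_assert(!std::is_same<EastPointer, ConstPointer>::value, "different types");
```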
Ali Mohammad Pur
e27ec04cdd LibCpp: Allow 'override' as a function target qualifier
This is just ignored right now.
2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
5f66874ea0 LibCpp: Add support for parsing function types
This makes it work with types like `Function<T(U, V)>`.
2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
b3cbe14569 LibCpp: Allow 'const' after a function's signature
This is too lax for functions that aren't class members, but let's
allow that anyway.
2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
3319114127 LibCpp: Add support for parsing reference types 2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
3c1422d774 LibCpp: Allow virtual destructors 2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
c866a56f07 LibCpp: Match and ignore struct/class inheritance 2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
dc68c765b7 LibCpp: Correctly parse lines that end in '\'
Such lines should be considered to be joined into the next line.
This makes multiline preprocessor stuff "work".
2021-08-02 01:03:59 +02:00
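The joining rule can be sketched as a standalone pass over the source text. `splice_continued_lines()` is a hypothetical helper written for this example; it does not reflect how LibCpp implements the rule, and unlike a lexer-level approach it discards the original line positions.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: removes every backslash-newline pair, so that a line
// ending in '\' is joined with the line that follows it.
std::string splice_continued_lines(std::string_view source)
{
    std::string result;
    result.reserve(source.size());
    for (std::size_t i = 0; i < source.size(); ++i) {
        if (source[i] == '\\' && i + 1 < source.size() && source[i + 1] == '\n') {
            ++i; // Skip the newline as well as the backslash.
            continue;
        }
        result += source[i];
    }
    return result;
}
```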
Ali Mohammad Pur
8fefbfd5ac LibCpp: Parse enum members with explicit values 2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
67a19eaecb LibCpp: Parse "extern" declarations
Note that these are not `extern "C"` declarations, just extern
declaration qualifiers.
2021-08-02 01:03:59 +02:00
Ali Mohammad Pur
5d27740387 LibCpp: Accept scoped variable declarations
For instance, `Type Scope::Class::variable = value;` is a valid
declaration.
2021-08-02 01:03:59 +02:00
Itamar
42eb06f045 LibCpp: Don't store entire ASTNode vector in each parser state
We previously stored the entire ASTNode vector in each parser state,
and this vector was copied whenever a state was loaded or saved.

We don't actually need to store the whole nodes list in each state
because a new state can only add new nodes to this list, and won't
mutate existing nodes.

It would suffice to only hold a vector of the nodes that were created
while parsing in the current state to keep a reference to them.

This reduces the time it takes on my machine for the c++ language
server to handle a file that #includes <LibGUI/Widget.h> from ~4sec to
~0.7sec.
2021-07-13 23:20:09 +02:00
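The idea of keeping only the nodes created in the current state can be sketched as below. This is an illustration under stated assumptions, not the LibCpp implementation: `ToyParser`, `State`, `Node`, and the use of `std::shared_ptr` are all hypothetical, and the sketch assumes that a saved state is always restored later, as in speculative lookahead.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string name;
};

// Hypothetical parser state: holds only the nodes created while it was current.
struct State {
    std::size_t token_index = 0;
    std::vector<std::shared_ptr<Node>> nodes;
};

class ToyParser {
public:
    // Saving moves the current state onto the stack and starts a fresh state
    // with an empty node list, so no node vector is copied.
    void save_state()
    {
        State fresh;
        fresh.token_index = m_current.token_index;
        m_saved.push_back(std::move(m_current));
        m_current = std::move(fresh);
    }

    // Restoring drops the nodes created since the matching save_state().
    // Precondition: there is at least one saved state.
    void load_state()
    {
        m_current = std::move(m_saved.back());
        m_saved.pop_back();
    }

    std::shared_ptr<Node> create_node(std::string name)
    {
        auto node = std::make_shared<Node>(Node { std::move(name) });
        m_current.nodes.push_back(node); // Keeps the node alive for this state.
        return node;
    }

    void consume() { ++m_current.token_index; }
    std::size_t token_index() const { return m_current.token_index; }
    std::size_t nodes_in_current_state() const { return m_current.nodes.size(); }

private:
    State m_current;
    std::vector<State> m_saved;
};
```

With this layout, the cost of saving or restoring a state no longer grows with the total number of nodes parsed so far, which is the effect the commit message describes.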
Itamar
eb6a15d52b LibCpp: Only store error messages for the main parser state
There's no need to store parser error messages for states with
depth > 0, as they will eventually be popped from the states stack and
their error messages will never be displayed to the user.

Profiling shows that this change reduces the % of backtraces that
contain the store_state & load_state functions from ~95% to ~70%.

Empirically this change reduces the time it takes on my machine for the
c++ language server to handle a file that #includes <LibGUI/Widget.h>
from ~14sec to ~4sec.
2021-07-13 23:20:09 +02:00
Itamar
b5a02b180c LibCpp: Use fast_is<T> and verify_cast<T> to replace C-style casts
Thanks to @alimpfard for suggesting this :)
2021-07-10 21:58:28 +02:00
Itamar
34fc6c7e1c LibCpp: Make the fields of AST node types private
Previously almost all fields were public and were directly accessed by
the Parser and CppComprehensionEngine.

This commit makes all fields of AST node types private. They are now
accessed via getters & setters.
2021-07-10 21:58:28 +02:00
Itamar
232013c05b LibCpp: Add Parser::tokens_in_range(start, end)
This function returns the tokens that exist in the specified range.
2021-07-04 17:50:33 +02:00
Itamar
9a31fb6673 LibCpp: Fix positional information of Pointer types 2021-07-04 17:50:33 +02:00
Itamar
1dfdfcf820 LibCpp: Fix parsing of ellipsis
Previously the positional information for the node of an ellipsis was
incorrect.
2021-07-04 17:50:33 +02:00
Itamar
4123be7639 LibCpp: Update Parser test data after Type=>NamedType change 2021-06-29 00:07:19 +04:30
Itamar
d7aa831a43 LibCpp: Differentiate between Type and NamedType
This adds a new ASTNode type called 'NamedType' which inherits from
the Type node.

Previously every Type node had a name field, but it was not logically
accurate. For example, pointer types do not have a name
(the pointed-to type may have one).
2021-06-29 00:07:19 +04:30
Itamar
10cad8a874 LibCpp: Add LOG_SCOPE() macro for debugging the parser's flow
LOG_SCOPE() uses ScopeLogger and additionally shows the current token
in the parser's state.
2021-06-29 00:07:19 +04:30
Itamar
c1ee0c1685 LibCpp: Support parsing enum classes 2021-06-29 00:07:19 +04:30
Federico Guerinoni
e0f1c237d2 HackStudio: Make TODO entries clickable
Now you can click a TODO entry to set focus on that position of that
file.
2021-06-23 19:00:11 +01:00
Federico Guerinoni
c397e030f4 LibCpp: Add function for retrieving TODO comments from the parser
Now `get_todo_entries` collects all TODOs found within comment
statements.
2021-06-23 19:00:11 +01:00
Brian Gianforcaro
6c114ecaef LibCpp: Remove InlineLinkedList from the list of known types 2021-06-16 10:40:01 +02:00
Andreas Kling
dc65f54c06 AK: Rename Vector::append(Vector) => Vector::extend(Vector)
Let's make it a bit more clear when we're appending the elements from
one vector to the end of another vector.
2021-06-12 13:24:45 +02:00
Itamar
ee9fe288b2 LibCpp: Add test for parsing class definitions 2021-06-09 22:26:46 +02:00
Itamar
7de6c1489b LibCpp: Parse basic constructors and destructors 2021-06-09 22:26:46 +02:00
Itamar
fd851ec5c9 LibCpp: Handle class access-specifiers in the Parser
We can now handle access-specifier tags (for example 'private:') when
parsing class declarations.

We currently only consume these tags and move on. We'll need to add some
logic that accounts for the access level of symbols down the road.
2021-06-09 22:26:46 +02:00
Itamar
dcdb0c7035 LibCpp: Support non-field class members
Previously, we had a special ASTNode for class members,
"MemberDeclaration", which only represented fields.

This commit removes MemberDeclaration and instead uses regular
Declaration nodes for representing the members of a class.

This means that we can now also parse methods, inner-classes, and other
declarations that appear inside of a class.
2021-06-09 22:26:46 +02:00
Itamar
8f074222e8 LibCpp: Make 'bool' a Token::Type::KnownType 2021-06-09 22:26:46 +02:00
Ali Mohammad Pur
71b4433b0d LibWeb+LibSyntax: Implement nested syntax highlighters
And use them to highlight javascript in HTML source.
This commit also changes how TextDocumentSpan::data is interpreted,
as it used to be an opaque pointer, but everyone stuffed an enum value
inside it, which made the values not unique to each highlighter;
that field is now a u64 serial id.
The syntax highlighters don't need to change their ways of stuffing
token types into that field, but a highlighter that calls another
nested highlighter needs to register the nested types for use with
token pairs.
2021-06-07 14:45:49 +04:30
Max Wipfli
ec2f0fc8eb LibCpp: Fix off-by-one error in SyntaxHighlighter
This changes the C++ SyntaxHighlighter to conform to the now-fixed
rendering of syntax highlighting spans in GUI::TextEditor.

Contrary to other syntax highlighters, for this one the change has been
made to the SyntaxHighlighter rather than the Lexer. This is due to the
fact that the Parser also uses the same Lexer. I'm sure there is some
more elegant way to do this, but this patch at least unbreaks the C++
syntax highlighting.
2021-06-05 00:32:28 +04:30
Max Wipfli
617c54a00b LibCpp: Do not emit empty whitespace token after include statement
If an include statement didn't contain whitespace between the word
"include" and the '<' or '"', the lexer would previously emit an empty
whitespace token. This has been changed now.
2021-06-05 00:32:28 +04:30
Max Wipfli
2aa0cbaf22 LibCpp: Use CharacterTypes.h and constexpr functions in Lexer 2021-06-05 00:32:28 +04:30
Max Wipfli
d57d7bea1c LibCpp: Use east const style in Lexer and SyntaxHighlighter 2021-06-05 00:32:28 +04:30