We previously stored the entire ASTNode vector in each parser state,
and this vector was copied whenever a state was loaded or saved.
We don't actually need to store the whole nodes list in each state
because a new state can only add new nodes to this list, and won't
mutate existing nodes.
It would suffice to only hold a vector of the nodes that were created
while parsing in the current state to keep a reference to them.
This reduces the time it takes on my machine for the c++ language
server to handle a file that #includes <LibGUI/Widget.h> from ~4sec to
~0.7sec.
There's no need to store parser error messages for states with
depth > 0, as they will eventually be popped from the states stack and
their error messages will never be displayed to the user.
Profiling shows that this change reduces the % of backtraces that
contain the store_state & load_state functions from ~95% to ~70%.
Empirically this change reduces the time it takes on my machine for the
c++ language server to handle a file that #includes <LibGUI/Widget.h>
from ~14sec to ~4sec.
Previously almost all fields were public and were directly accessed by
the Parser and CppComprehensionEngine.
This commit makes all fields of AST node types private. They are now
accessed via getters & setters.
This adds a new ASTNode type called 'NamedType' which inherits from
the Type node.
Previously every Type node had a name field, but it was not logically
accurate. For example, pointer types do not have a name
(the pointed-to type may have one).
We can now handle access-specifier tags (for example 'private:') when
parsing class declarations.
We currently only consume these tags on move on. We'll need to add some
logic that accounts for the access level of symbols down the road.
Previously, we had a special ASTNode for class members,
"MemberDeclaration", which only represented fields.
This commit removes MemberDeclaration and instead uses regular
Declaration nodes for representing the members of a class.
This means that we can now also parse methods, inner-classes, and other
declarations that appear inside of a class.
And use them to highlight javascript in HTML source.
This commit also changes how TextDocumentSpan::data is interpreted,
as it used to be an opaque pointer, but everyone stuffed an enum value
inside it, which made the values not unique to each highlighter;
that field is now a u64 serial id.
The syntax highlighters don't need to change their ways of stuffing
token types into that field, but a highlighter that calls another
nested highlighter needs to register the nested types for use with
token pairs.
This changes the C++ SyntaxHighlighter to conform to the now-fixed
rendering of syntax highlighting spans in GUI::TextEditor.
Contrary to other syntax highlighters, for this one the change has been
made to the SyntaxHighlighter rather than the Lexer. This is due to the
fact that the Parser also uses the same Lexer. I'm soure there is some
more elegant way to do this, but this patch at least unbreaks the C++
syntax highlighting.
If an include statement didn't contain whitespace between the word
"include" and the '<' or '"', the lexer would previous emit an empty
whitespace token. This has been changed now.
For each .cpp file in the test suite data, there is a .ast file that
represents the "known good" baseline of the parser result.
Each .cpp file goes through the parser, and the result of
invoking `ASTNode::dump()` on the root node is compared to the
baseline to find regressions.
We also check that there were no parser errors when parsing the .cpp
files.
Previously, ASTNode::dump() used outln() for output, which meant it
always wrote its output to stdout.
After this commit, ASTNode::dump() receives an 'output' argument (which
is stdout by default). This enables writing the output to somewhere
else.
This will be useful for testing the LibCpp Parser with the output of
ASTNode::dump.
After this commit, Parser::index_of_node_at will prefer to return nodes
with greater indices.
Since the parsing logic ensures that child nodes come after parent
nodes, this change makes this function return child nodes when possible.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *