LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
/*
|
2023-02-19 21:07:52 +00:00
|
|
|
* Copyright (c) 2022-2023, Andreas Kling <kling@serenityos.org>
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
*
|
|
|
|
* SPDX-License-Identifier: BSD-2-Clause
|
|
|
|
*/
|
|
|
|
|
2022-11-24 13:16:56 +00:00
|
|
|
#include <AK/BinarySearch.h>
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
#include <AK/Utf8View.h>
|
|
|
|
#include <LibJS/SourceCode.h>
|
|
|
|
#include <LibJS/SourceRange.h>
|
|
|
|
#include <LibJS/Token.h>
|
|
|
|
|
|
|
|
namespace JS {
|
|
|
|
|
2023-02-19 21:07:52 +00:00
|
|
|
NonnullRefPtr<SourceCode const> SourceCode::create(String filename, String code)
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
{
|
|
|
|
return adopt_ref(*new SourceCode(move(filename), move(code)));
|
|
|
|
}
|
|
|
|
|
2023-01-26 13:33:18 +00:00
|
|
|
SourceCode::SourceCode(String filename, String code)
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
: m_filename(move(filename))
|
|
|
|
, m_code(move(code))
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2023-01-26 13:33:18 +00:00
|
|
|
String const& SourceCode::filename() const
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
{
|
|
|
|
return m_filename;
|
|
|
|
}
|
|
|
|
|
2023-01-26 13:33:18 +00:00
|
|
|
String const& SourceCode::code() const
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
{
|
|
|
|
return m_code;
|
|
|
|
}
|
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
void SourceCode::fill_position_cache() const
|
2022-11-24 13:16:56 +00:00
|
|
|
{
|
2023-09-12 11:03:56 +00:00
|
|
|
constexpr size_t minimum_distance_between_cached_positions = 10000;
|
2022-11-24 13:16:56 +00:00
|
|
|
|
|
|
|
if (m_code.is_empty())
|
|
|
|
return;
|
|
|
|
|
|
|
|
bool previous_code_point_was_carriage_return = false;
|
2023-09-12 11:03:56 +00:00
|
|
|
size_t line = 1;
|
|
|
|
size_t column = 1;
|
|
|
|
size_t offset_of_last_starting_point = 0;
|
|
|
|
m_cached_positions.ensure_capacity(m_code.bytes().size() / minimum_distance_between_cached_positions);
|
|
|
|
m_cached_positions.append({ .line = 1, .column = 1, .offset = 0 });
|
|
|
|
|
|
|
|
Utf8View const view(m_code);
|
2022-11-24 13:16:56 +00:00
|
|
|
for (auto it = view.begin(); it != view.end(); ++it) {
|
|
|
|
u32 code_point = *it;
|
|
|
|
bool is_line_terminator = code_point == '\r' || (code_point == '\n' && !previous_code_point_was_carriage_return) || code_point == LINE_SEPARATOR || code_point == PARAGRAPH_SEPARATOR;
|
|
|
|
previous_code_point_was_carriage_return = code_point == '\r';
|
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
auto byte_offset = view.byte_offset_of(it);
|
|
|
|
if ((byte_offset - offset_of_last_starting_point) >= minimum_distance_between_cached_positions) {
|
|
|
|
m_cached_positions.append({ .line = line, .column = column, .offset = byte_offset });
|
|
|
|
offset_of_last_starting_point = byte_offset;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (is_line_terminator) {
|
|
|
|
line += 1;
|
|
|
|
column = 1;
|
|
|
|
} else {
|
|
|
|
column += 1;
|
|
|
|
}
|
2022-11-24 13:16:56 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
SourceRange SourceCode::range_from_offsets(u32 start_offset, u32 end_offset) const
|
|
|
|
{
|
2022-11-24 17:01:52 +00:00
|
|
|
// If the underlying code is an empty string, the range is 1,1 - 1,1 no matter what.
|
2022-11-24 13:16:56 +00:00
|
|
|
if (m_code.is_empty())
|
2022-11-24 17:01:52 +00:00
|
|
|
return { *this, { .line = 1, .column = 1, .offset = 0 }, { .line = 1, .column = 1, .offset = 0 } };
|
2022-11-24 13:16:56 +00:00
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
if (m_cached_positions.is_empty())
|
|
|
|
fill_position_cache();
|
2022-11-24 13:16:56 +00:00
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
Position current { .line = 1, .column = 1, .offset = 0 };
|
2022-11-24 13:16:56 +00:00
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
if (!m_cached_positions.is_empty()) {
|
|
|
|
Position const dummy;
|
|
|
|
size_t nearest_index = 0;
|
|
|
|
binary_search(m_cached_positions, dummy, &nearest_index,
|
|
|
|
[&](auto&, auto& starting_point) {
|
|
|
|
return start_offset - starting_point.offset;
|
|
|
|
});
|
|
|
|
|
|
|
|
current = m_cached_positions[nearest_index];
|
2022-11-24 13:16:56 +00:00
|
|
|
}
|
|
|
|
|
2022-11-24 17:01:52 +00:00
|
|
|
Optional<Position> start;
|
|
|
|
Optional<Position> end;
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
|
|
|
|
bool previous_code_point_was_carriage_return = false;
|
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
Utf8View const view(m_code);
|
|
|
|
for (auto it = view.iterator_at_byte_offset_without_validation(current.offset); it != view.end(); ++it) {
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
|
2022-11-24 17:01:52 +00:00
|
|
|
// If we're on or after the start offset, this is the start position.
|
|
|
|
if (!start.has_value() && view.byte_offset_of(it) >= start_offset) {
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
start = Position {
|
2023-09-12 11:03:56 +00:00
|
|
|
.line = current.line,
|
|
|
|
.column = current.column,
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
.offset = start_offset,
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2022-11-24 17:01:52 +00:00
|
|
|
// If we're on or after the end offset, this is the end position.
|
|
|
|
if (!end.has_value() && view.byte_offset_of(it) >= end_offset) {
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
end = Position {
|
2023-09-12 11:03:56 +00:00
|
|
|
.line = current.line,
|
|
|
|
.column = current.column,
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
.offset = end_offset,
|
|
|
|
};
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
u32 code_point = *it;
|
|
|
|
|
2023-09-12 11:03:56 +00:00
|
|
|
bool const is_line_terminator = code_point == '\r' || (code_point == '\n' && !previous_code_point_was_carriage_return) || code_point == LINE_SEPARATOR || code_point == PARAGRAPH_SEPARATOR;
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
previous_code_point_was_carriage_return = code_point == '\r';
|
|
|
|
|
|
|
|
if (is_line_terminator) {
|
2023-09-12 11:03:56 +00:00
|
|
|
current.line += 1;
|
|
|
|
current.column = 1;
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
continue;
|
|
|
|
}
|
2023-09-12 11:03:56 +00:00
|
|
|
current.column += 1;
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
}
|
|
|
|
|
2022-11-24 17:01:52 +00:00
|
|
|
// If we didn't find both a start and end position, just return 1,1-1,1.
|
|
|
|
// FIXME: This is a hack. Find a way to return the nicest possible values here.
|
|
|
|
if (!start.has_value() || !end.has_value())
|
|
|
|
return SourceRange { *this, { .line = 1, .column = 1 }, { .line = 1, .column = 1 } };
|
|
|
|
|
|
|
|
return SourceRange { *this, *start, *end };
|
LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
2022-11-21 16:37:38 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
}
|