LibWeb: Make ON_WHITESPACE less heavy in HTML tokenizer

Once we know that the current code point is an ASCII character, we can
just check if it's one of the HTML whitespace characters.

Before this patch, we were using the generic StringView::contains(u32)
path that splats a code point into a StringBuilder and then searches
for it with memmem().

This reduces time spent in the HTML tokenizer from 16% to 6% when
loading the ECMA-262 spec.
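
The difference between the two paths can be sketched in standard C++. This is only an illustration under stated assumptions: `first_is_one_of_sketch`, `is_whitespace_slow`, and `is_whitespace_fast` are hypothetical names, `std::optional<std::uint32_t>` stands in for the tokenizer's current input character, and `std::string` stands in for the StringBuilder plus memmem() search. None of this is the actual AK or LibWeb code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for a first_is_one_of()-style helper:
// a fold expression that expands to a chain of == comparisons.
template<typename T, typename... Ts>
constexpr bool first_is_one_of_sketch(T value, Ts... candidates)
{
    return ((value == candidates) || ...);
}

// Old shape (approximation): build a temporary string holding the
// code point, then run a substring search over the whitespace set.
inline bool is_whitespace_slow(std::optional<std::uint32_t> code_point)
{
    if (!code_point.has_value() || *code_point > 0x7F)
        return false;
    std::string needle(1, static_cast<char>(*code_point));
    return std::string_view("\t\n\f ").find(needle) != std::string_view::npos;
}

// New shape: once the code point is known to be ASCII, narrow it to
// char and compare it directly against the four candidates.
inline bool is_whitespace_fast(std::optional<std::uint32_t> code_point)
{
    return code_point.has_value()
        && *code_point <= 0x7F
        && first_is_one_of_sketch(static_cast<char>(*code_point), '\t', '\n', '\f', ' ');
}
```

Both functions should agree on every input; the second one simply avoids the temporary buffer and the search, which is the kind of saving the message describes.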
Andreas Kling 2 years ago
parent
commit
c79e8aab0a

+ 1 - 1
Userland/Libraries/LibWeb/HTML/Parser/HTMLTokenizer.cpp

@@ -121,7 +121,7 @@ namespace Web::HTML {
     if (current_input_character.has_value() && is_ascii_hex_digit(current_input_character.value()))
 
 #define ON_WHITESPACE \
-    if (current_input_character.has_value() && is_ascii(current_input_character.value()) && "\t\n\f "sv.contains(current_input_character.value()))
+    if (current_input_character.has_value() && is_ascii(*current_input_character) && first_is_one_of(static_cast<char>(*current_input_character), '\t', '\n', '\f', ' '))
 
 
 #define ANYTHING_ELSE if (1)