
LibTextCodec: Change UTF-8's decoder to replace invalid code points

The UTF-8 decoder will currently crash if it is provided invalid UTF-8
input. Instead, change its behavior to match that of all other decoders
to replace invalid code points with U+FFFD. This is required by the web.
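
To illustrate the "replace instead of crash" behavior described above, here is a minimal standalone sketch written against the C++ standard library rather than AK or LibTextCodec. The function name `decode_utf8_lossy` is hypothetical and is not the code this commit touches. It assumes the usual web-style handling, where each malformed sequence becomes a single U+FFFD and a byte that fails to continue a sequence is re-examined as a potential lead byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper (not part of LibTextCodec): decodes UTF-8 into code
// points, emitting U+FFFD for each malformed sequence instead of failing.
std::u32string decode_utf8_lossy(std::string_view input)
{
    constexpr char32_t replacement = 0xFFFD;
    std::u32string output;
    std::size_t i = 0;

    while (i < input.size()) {
        auto lead = static_cast<std::uint8_t>(input[i]);
        std::size_t needed = 0;
        char32_t code_point = 0;
        // Allowed range for the next continuation byte; narrowed for some
        // lead bytes to reject overlong forms, surrogates and > U+10FFFF.
        std::uint8_t lower = 0x80, upper = 0xBF;

        if (lead <= 0x7F) {
            output.push_back(lead);
            ++i;
            continue;
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            needed = 1;
            code_point = lead & 0x1F;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            needed = 2;
            code_point = lead & 0x0F;
            if (lead == 0xE0) lower = 0xA0; // overlong three-byte form
            if (lead == 0xED) upper = 0x9F; // UTF-16 surrogate range
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            needed = 3;
            code_point = lead & 0x07;
            if (lead == 0xF0) lower = 0x90; // overlong four-byte form
            if (lead == 0xF4) upper = 0x8F; // beyond U+10FFFF
        } else {
            // Stray continuation byte or invalid lead byte.
            output.push_back(replacement);
            ++i;
            continue;
        }

        ++i;
        bool ok = true;
        for (; needed > 0; --needed) {
            if (i >= input.size()) { ok = false; break; } // truncated input
            auto byte = static_cast<std::uint8_t>(input[i]);
            // Do not consume the offending byte; it is retried as a lead.
            if (byte < lower || byte > upper) { ok = false; break; }
            lower = 0x80;
            upper = 0xBF;
            code_point = (code_point << 6) | (byte & 0x3F);
            ++i;
        }
        output.push_back(ok ? code_point : replacement);
    }
    return output;
}
```

With this sketch, a lone `0xFF` byte between two ASCII letters yields the letters with one U+FFFD between them, and a truncated three-byte sequence at the end of the input yields a single U+FFFD.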
Timothy Flynn 2 years ago
parent
commit
00fa23237a
1 changed file with 6 additions and 4 deletions

+ 6 - 4
Userland/Libraries/LibTextCodec/Decoder.cpp

@@ -213,10 +213,12 @@ ErrorOr<String> convert_input_to_utf8_using_given_decoder_unless_there_is_a_byte
 
     VERIFY(actual_decoder);
 
-    // FIXME: 3. Process a queue with an instance of encoding’s decoder, ioQueue, output, and "replacement".
-    //        This isn't the exact same as the spec, especially the error mode of "replacement", which we don't have the concept of yet.
+    // 3. Process a queue with an instance of encoding’s decoder, ioQueue, output, and "replacement".
+    // FIXME: This isn't the exact same as the spec, which is written in terms of I/O queues.
+    auto output = TRY(actual_decoder->to_utf8(input));
+
     // 4. Return output.
-    return actual_decoder->to_utf8(input);
+    return output;
 }
 
 ErrorOr<String> Decoder::to_utf8(StringView input)
@@ -242,7 +244,7 @@ ErrorOr<String> UTF8Decoder::to_utf8(StringView input)
         bomless_input = input.substring_view(3);
     }
 
-    return String::from_utf8(bomless_input);
+    return Decoder::to_utf8(bomless_input);
 }
 
 ErrorOr<void> UTF16BEDecoder::process(StringView input, Function<ErrorOr<void>(u32)> on_code_point)
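
The second hunk keeps the existing byte order mark skip (`input.substring_view(3)`) and only changes which routine receives the remaining bytes. The condition guarding that skip is not visible in the diff, so the following is only a sketch of the idea using `std::string_view`; the name `strip_utf8_bom` is hypothetical, and the check for the three bytes EF BB BF is an assumption about what the surrounding code tests.

```cpp
#include <string_view>

// Hypothetical helper (not part of LibTextCodec): returns the input with a
// leading UTF-8 byte order mark (EF BB BF) removed, if one is present.
std::string_view strip_utf8_bom(std::string_view input)
{
    constexpr std::string_view bom = "\xEF\xBB\xBF";
    // substr clamps the length, so inputs shorter than the BOM are safe.
    if (input.substr(0, bom.size()) == bom)
        input.remove_prefix(bom.size());
    return input;
}
```

The result of such a helper would then be handed to a decoder that replaces malformed sequences, which is the role `Decoder::to_utf8(bomless_input)` takes over from `String::from_utf8(bomless_input)` in the hunk above.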