LibWeb: Fix off-by-one error in HTML Tokenizer

In 'NamedCharacterReference' we attempt to look up the code point by an identifier, e.g. `apos;` becomes `'`. This is done by passing the entire rest of the document to the `HTML::code_points_from_entity` function. However, before this change we didn't send the final character, which meant that if the document ended in a named character reference, the lookup would fail.
orospakr · Feb 21, 2022 · c6fcdd0
1 parent c6dd8a1
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/Userland/Libraries/LibWeb/HTML/Parser/HTMLTokenizer.cpp b/Userland/Libraries/LibWeb/HTML/Parser/HTMLTokenizer.cpp
@@ -1617,7 +1617,7 @@ Optional<HTMLToken> HTMLTokenizer::next_token()
  {
  size_t byte_offset = m_utf8_view.byte_offset_of(m_prev_utf8_iterator);
 
- auto match = HTML::code_points_from_entity(m_decoded_input.substring_view(byte_offset, m_decoded_input.length() - byte_offset - 1));
+ auto match = HTML::code_points_from_entity(m_decoded_input.substring_view(byte_offset, m_decoded_input.length() - byte_offset));
 
  if (match.has_value()) {
  skip(match->entity.length() - 1);