LibWeb: Fix off by one error in HTML Tokenizer
In 'NamedCharacterReference' we attempt to look up the code point by an
identifier, e.g. apos; becomes '

This is done by passing the entire rest of the document to the
`HTML::code_points_from_entity` function.

However, before this change we didn't send the final character, which
meant if the document ended in a named character reference the lookup
would fail.
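
To illustrate the failure mode outside of LibWeb, here is a minimal sketch using `std::string_view`. Everything in it is an assumption made for illustration: `EntityMatch`, `lookup_entity_prefix`, `rest_without_last_char`, and `rest_of_document` are hypothetical names, and the three-entry table merely stands in for the real entity table behind `HTML::code_points_from_entity`. Only the two length computations mirror the diff below.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for an entity table entry; not LibWeb's actual type.
struct EntityMatch {
    std::string_view entity;
    char32_t code_point;
};

// Hypothetical stand-in for HTML::code_points_from_entity: returns the
// longest table entry that is a prefix of `input`, if there is one.
inline std::optional<EntityMatch> lookup_entity_prefix(std::string_view input)
{
    static constexpr std::array<EntityMatch, 3> table { {
        { "amp;", U'&' },
        { "apos;", U'\'' },
        { "lt;", U'<' },
    } };

    std::optional<EntityMatch> best;
    for (auto const& candidate : table) {
        // substr() clamps the count, so a short input never reads out of range.
        bool is_prefix = input.substr(0, candidate.entity.size()) == candidate.entity;
        if (is_prefix && (!best || candidate.entity.size() > best->entity.size()))
            best = candidate;
    }
    return best;
}

// Length computation before the change: the final byte of the document is dropped.
inline std::string_view rest_without_last_char(std::string_view document, std::size_t byte_offset)
{
    return document.substr(byte_offset, document.size() - byte_offset - 1);
}

// Length computation after the change: the whole remainder is passed on.
inline std::string_view rest_of_document(std::string_view document, std::size_t byte_offset)
{
    return document.substr(byte_offset, document.size() - byte_offset);
}
```

With a document such as `a &apos;` and a byte offset pointing just past the `&`, the first helper yields `apos`, which matches nothing in the sketch's table, while the second yields `apos;` and matches. When other content follows the reference, both helpers still contain the full entity, which is why the bug only showed up at the very end of a document.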
ant1441 authored and awesomekling committed Feb 21, 2022
1 parent c6dd8a1 commit c6fcdd0
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion Userland/Libraries/LibWeb/HTML/Parser/HTMLTokenizer.cpp
@@ -1617,7 +1617,7 @@ Optional<HTMLToken> HTMLTokenizer::next_token()
 {
     size_t byte_offset = m_utf8_view.byte_offset_of(m_prev_utf8_iterator);

-    auto match = HTML::code_points_from_entity(m_decoded_input.substring_view(byte_offset, m_decoded_input.length() - byte_offset - 1));
+    auto match = HTML::code_points_from_entity(m_decoded_input.substring_view(byte_offset, m_decoded_input.length() - byte_offset));

     if (match.has_value()) {
         skip(match->entity.length() - 1);
