LibWeb: Fix off-by-one in HTMLTokenizer::restore_to()
The difference should be between m_utf8_iterator and the
new position. If m_prev_utf8_iterator is used instead, one
fewer source position is popped than required.

This issue was not apparent on most pages, since restore_to()
is used for tokens such as <!doctype> that are normally
followed by a newline, which resets the column to zero,
but it can be seen on pages with minified HTML.
MacDue authored and linusg committed Feb 13, 2022
1 parent 62ad33a commit b193351
Showing 1 changed file with 7 additions and 9 deletions.
16 changes: 7 additions & 9 deletions Userland/Libraries/LibWeb/HTML/Parser/HTMLTokenizer.cpp
@@ -2726,15 +2726,13 @@ bool HTMLTokenizer::consumed_as_part_of_an_attribute() const

 void HTMLTokenizer::restore_to(Utf8CodePointIterator const& new_iterator)
 {
-    if (new_iterator != m_prev_utf8_iterator) {
-        auto diff = m_prev_utf8_iterator - new_iterator;
-        if (diff > 0) {
-            for (ssize_t i = 0; i < diff; ++i)
-                m_source_positions.take_last();
-        } else {
-            // Going forwards...?
-            TODO();
-        }
+    auto diff = m_utf8_iterator - new_iterator;
+    if (diff > 0) {
+        for (ssize_t i = 0; i < diff; ++i)
+            m_source_positions.take_last();
+    } else {
+        // Going forwards...?
+        TODO();
     }
     m_utf8_iterator = new_iterator;
 }
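
The off-by-one described in the commit message can be reproduced with a small stand-alone model. The sketch below is not LibWeb code: ToyTokenizer, consume(), restore_to_old(), restore_to_new() and entries_left_after_restore() are hypothetical names, plain indices stand in for Utf8CodePointIterator, and it assumes exactly one source position is recorded per consumed code point. The TODO() branch for forward moves is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the tokenizer state (an assumption,
// not the real HTMLTokenizer): indices replace the UTF-8 iterators.
struct ToyTokenizer {
    std::size_t current = 0;                   // plays the role of m_utf8_iterator
    std::size_t previous = 0;                  // plays the role of m_prev_utf8_iterator
    std::vector<std::size_t> source_positions; // plays the role of m_source_positions

    // Consume one code point and record one source position for it.
    void consume()
    {
        previous = current;
        ++current;
        source_positions.push_back(current);
    }

    // Drop the last `count` recorded positions (no-op if count <= 0).
    void pop_entries(std::ptrdiff_t count)
    {
        for (std::ptrdiff_t i = 0; i < count; ++i)
            source_positions.pop_back();
    }

    // Pre-fix logic: distance measured from the previous iterator.
    void restore_to_old(std::size_t target)
    {
        if (target != previous) {
            pop_entries(static_cast<std::ptrdiff_t>(previous)
                - static_cast<std::ptrdiff_t>(target));
        }
        current = target;
    }

    // Post-fix logic: distance measured from the current iterator.
    void restore_to_new(std::size_t target)
    {
        pop_entries(static_cast<std::ptrdiff_t>(current)
            - static_cast<std::ptrdiff_t>(target));
        current = target;
    }
};

// Consume five code points, rewind to index 2, and report how many
// source positions remain recorded afterwards.
std::size_t entries_left_after_restore(bool use_fix)
{
    ToyTokenizer tokenizer;
    for (int i = 0; i < 5; ++i)
        tokenizer.consume();
    if (use_fix)
        tokenizer.restore_to_new(2);
    else
        tokenizer.restore_to_old(2);
    return tokenizer.source_positions.size();
}
```

In this model, rewinding to index 2 should leave two recorded positions. The post-fix variant does so, while the pre-fix variant leaves three, that is, one stale entry, which matches the "one fewer source position is popped" wording above.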