Replace rpmfi(les) hardlink discovery + lookup with STL containers #3112

Merged 2 commits on May 21, 2024

pmatilai
Member

This was a fairly tricky one; details are in the commits.

This is a fairly tricksy one: the old structure had a manual refcount
doubling up as the number of file links, and the file index array
as a flexible array member at the end of the struct. Replace the manual
reference count with a shared_ptr to a vector of file indexes. The
vector is deleted when its last reference goes away, which happens at
the latest when the nlink hash itself is deleted. Add a couple of
typedefs to tame the rather verbose types.
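
The ownership change described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the commit: the typedef names `FileIndexes`, `HardlinkSet` and `NlinkHash`, and the helpers `addHardlinkSet()` and `linkCount()`, are hypothetical. The point is that the vector's size replaces the old link count, and the shared_ptr use count replaces the manual refcount.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical typedefs to tame the verbose types (not rpm's real names).
using FileIndexes = std::vector<int>;                    // indexes of one hardlink set
using HardlinkSet = std::shared_ptr<FileIndexes>;        // shared by every member file
using NlinkHash = std::unordered_map<int, HardlinkSet>;  // file index -> its set

// Register one hardlink set under each of its member file indexes.
inline void addHardlinkSet(NlinkHash &hash, const FileIndexes &files)
{
    assert(!files.empty());
    auto set = std::make_shared<FileIndexes>(files);
    for (int ix : files)
        hash[ix] = set;  // copies the shared_ptr, bumping its use count
}

// Number of links of a file: the size of its set, or 1 if it has no entry.
inline std::size_t linkCount(const NlinkHash &hash, int ix)
{
    auto it = hash.find(ix);
    return (it != hash.end()) ? it->second->size() : 1;
}
```

Erasing entries, or destroying the whole hash, releases the references one by one, and the vector is freed with the last of them, so no explicit free routine is needed.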

This was another tricky one for a couple of reasons. The behavior here
depends on an undocumented rpmhash implementation detail, namely that
multiple values per key are preserved in insertion order. This is
not guaranteed for unordered_multimap, where the order is up to the
implementation. Also, unlike rpmhash, unordered_multimap does not have
a method for retrieving the number of keys, so whether hardlinks were
discovered needs to be tracked differently.
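
To illustrate the second limitation, here is a small sketch, assuming a hypothetical `InodeMap` that maps an inode number to a file index (rpm's real key is not shown in this message). Since `size()` counts elements rather than distinct keys, finding out whether any key has several values takes an explicit walk over the groups of equivalent keys.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <unordered_map>

// Hypothetical map for illustration: inode number -> file index.
using InodeMap = std::unordered_multimap<unsigned long, int>;

// Return true if at least one key has more than one value.
inline bool hasDuplicateKeys(const InodeMap &m)
{
    for (auto it = m.begin(); it != m.end(); ) {
        auto range = m.equal_range(it->first);
        // Equivalent keys are stored adjacently, so each group starts at `it`.
        assert(range.first == it);
        if (std::distance(range.first, range.second) > 1)
            return true;
        it = range.second;  // jump to the next group of keys
    }
    return false;
}
```

Note that this only answers whether duplicates exist. It says nothing about the order of the values within a group, which is the other problem described above.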

We avoid both of these problems by realizing that the arrays generated
in the second step are exactly the same as those we already calculated
in the first round. So we collect the indexes into the smart pointer
vectors in the discovery stage already, utilizing .emplace() to avoid
unnecessary instantiation/destruction or an extra lookup. With that, we
know there are hardlinks in the file set if any key has more than one
index associated. A vector obviously keeps its order when pushing back
to it. Finally, we save a round of data structure copying by just
transferring the relevant vectors to the file index keyed hash we use
for hardlink lookups elsewhere, and the compiler takes care of all the
bookkeeping.
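
The two stages described above might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the merged code: the function name `findHardlinks()` is hypothetical, and files are grouped by a bare inode number, whereas the real discovery key is not spelled out in this message.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

using FileIndexes = std::vector<int>;
using HardlinkSet = std::shared_ptr<FileIndexes>;

// Map each hardlinked file index to the shared vector of all indexes in
// its set. An empty result means no hardlinks were found.
// Assumption: inodes[ix] is the inode number of file ix.
inline std::unordered_map<int, HardlinkSet>
findHardlinks(const std::vector<std::uint64_t> &inodes)
{
    std::unordered_map<std::uint64_t, HardlinkSet> byInode;

    // Discovery: collect file indexes per key, in file order.
    for (int ix = 0; ix < static_cast<int>(inodes.size()); ix++) {
        // emplace() looks up and inserts in one step; an existing entry
        // is left untouched, so a vector is only allocated for new keys.
        auto res = byInode.emplace(inodes[ix], nullptr);
        if (res.second)
            res.first->second = std::make_shared<FileIndexes>();
        res.first->second->push_back(ix);
    }

    // Transfer: share (not copy) each multi-member vector under every
    // file index it contains.
    std::unordered_map<int, HardlinkSet> byIndex;
    for (const auto &entry : byInode) {
        const HardlinkSet &set = entry.second;
        assert(set && !set->empty());
        if (set->size() < 2)
            continue;  // a single index means the file is not hardlinked
        for (int ix : *set)
            byIndex.emplace(ix, set);
    }
    return byIndex;
}
```

Because each vector is filled with push_back() while walking the files in order, the indexes inside a set stay in file order no matter how the hash map arranges its keys.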
@pmatilai pmatilai merged commit 080fa89 into rpm-software-management:master May 21, 2024
1 check passed
@pmatilai pmatilai deleted the cxx-hardlink branch May 21, 2024 06:11