Garbage collection may not be performed often enough #27

Closed
amberin opened this issue Oct 14, 2023 · 1 comment · Fixed by #260
Labels
bug Something isn't working Git repos Related to Git syncing

Comments

@amberin
Member

amberin commented Oct 14, 2023

I just verified this by running "git gc" on my main repo in a long-running Orgzly install. The repo shrank from 18 MB to 3 MB.

We should be able to trigger gc but let Git decide whether it is necessary. That way, a typical sync should not take noticeably longer.

@amberin amberin added the bug Something isn't working label Oct 14, 2023
@amberin
Member Author

amberin commented Oct 14, 2023

Just like I thought, JGit is supposed to check during common porcelain commands (e.g. fetch) if garbage collection is needed, and perform it automatically, just like Git does.

But it seems that the default threshold for triggering garbage collection (6700 loose objects) might be a bit high for the mobile use case. Either that, or JGit or Orgzly is handling the Git repo in a particularly inefficient manner. I am fairly sure that Orgzly's syncing has become faster each time I have run garbage collection on repo clones that had been in use for a year or so.

I looked at my laptop clone of my above-mentioned repo, and it was 48 MB. I ran git gc --auto on it, but Git did not consider housekeeping necessary. I then lowered the loose object threshold from 6700 to 3300 (by running git config gc.auto 3300), and then ran git gc --auto again. This time, housekeeping was performed, and the repo size shrank to 4.6 MB.
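To make the threshold comparison concrete, here is a rough Python sketch that counts loose objects in a repo's objects directory and checks the count against a gc.auto-style limit. The function names are made up for illustration, and counting every loose file exactly is a simplification; Git and JGit may arrive at their number differently, so treat this as an approximation for inspecting a clone, not a description of their logic.

```python
import os
import re

# Loose objects are stored as .git/objects/<2 hex chars>/<38 hex chars>
# in SHA-1 repositories (assumption: the repo does not use SHA-256).
_FANOUT_DIR = re.compile(r"[0-9a-f]{2}")
_OBJECT_FILE = re.compile(r"[0-9a-f]{38}")


def count_loose_objects(objects_dir):
    """Count loose object files under a .git/objects directory."""
    total = 0
    for entry in os.listdir(objects_dir):
        subdir = os.path.join(objects_dir, entry)
        # Skip pack/ and info/, which do not hold loose objects.
        if not _FANOUT_DIR.fullmatch(entry) or not os.path.isdir(subdir):
            continue
        total += sum(
            1 for name in os.listdir(subdir) if _OBJECT_FILE.fullmatch(name)
        )
    return total


def exceeds_gc_auto(objects_dir, threshold=6700):
    """Return True if the loose object count is above the threshold."""
    return count_loose_objects(objects_dir) > threshold
```

Calling exceeds_gc_auto("path/to/repo/.git/objects", threshold=3300) on a clone would show whether a lowered limit like the one tried above is already exceeded, without changing the repo's configuration.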

So I suppose we could consider setting a lower threshold, so that garbage collection happens more often. But testing this to find a good value is obviously quite tricky.

One possible heuristic for deciding how often to perform manual garbage collection is suggested on Stack Overflow:

What I'd do is run it now, then a week from now take a measurement of disk utilization, run it again, and measure disk utilization again. If it drops 5% in size, then run it once a week. If it drops more, then run it more frequently. If it drops less, then run it less frequently.

I suppose we could use a similar method: Set a slightly lower gc.auto value, perform manual garbage collection, and then check after a certain amount of time how much space can be saved by another manual round. And then progressively lower the value until the observed disk space saving is negligible.
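The tuning loop described above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the function names, the 5% cut-off for "negligible" (borrowed from the quoted heuristic), the 0.75 reduction factor and the floor of 100 are not values anyone has tested for Orgzly.

```python
def saving_fraction(size_before, size_after):
    """Fraction of disk space reclaimed by one manual gc round."""
    if size_before <= 0:
        raise ValueError("size_before must be positive")
    return (size_before - size_after) / size_before


def next_gc_auto(current, size_before, size_after,
                 negligible=0.05, factor=0.75, floor=100):
    """Suggest the next gc.auto value to try, or None to stop lowering.

    negligible, factor and floor are illustrative defaults, not tested values.
    """
    if saving_fraction(size_before, size_after) <= negligible:
        return None  # the manual round saved little: keep the current value
    lowered = max(floor, int(current * factor))
    # Stop once the floor has been reached and the value cannot go lower.
    return lowered if lowered < current else None
```

With the laptop numbers from this comment (48 MB before, 4.6 MB after, threshold 6700), the saving is about 90%, so the sketch would suggest trying 5025 next. Sizes can be in any unit as long as both use the same one.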

@amberin amberin changed the title Git: Garbage collection is never performed Garbage collection is never performed Oct 14, 2023
@amberin amberin added the Git repos Related to Git syncing label Oct 14, 2023
@amberin amberin changed the title Garbage collection is never performed Garbage collection is not performed often enough Oct 14, 2023
@amberin amberin changed the title Garbage collection is not performed often enough Garbage collection may not be performed often enough Oct 14, 2023
@amberin amberin linked a pull request Nov 4, 2023 that will close this issue
amberin referenced this issue in amberin/orgzly-android-revived Nov 26, 2023
I just noticed that my day-to-day repo could be shrunk from >6 MB to 2.7
MB by running a regular "git gc". This means that more than half of its
content was garbage, which should have been collected.

I also confirmed that a recent clone of the repo is much more nippy in
Orgzly.

Related to issue #27.
@amberin amberin linked a pull request May 20, 2024 that will close this issue