Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote objects are never destroyed, which means Repository is never destroyed #403

Closed
mduggan opened this issue Aug 14, 2014 · 5 comments
Closed

Comments

@mduggan
Copy link
Contributor

mduggan commented Aug 14, 2014

This is a similar issue to #321 #326 #328, but the cause is different

When I do this:

import pygit2, sys
repo = pygit2.Repository('libgit2')
print "refcount before: %d" % sys.getrefcount(repo)
for i in range(1,100):
    len(repo.remotes)

print "refcount after: %d" % sys.getrefcount(repo)

I get this:

~/src$ python testgit.py 
refcount before: 2
refcount after: 101

The root cause is the self._self_handle in the Remote object. Because it's an opaque reference through ffi, it seems that the python garbage collector can't detect the circular reference of the Remote to itself. That means it never gets cleaned up, and it leaks a reference to the repo in self._repo, so that never gets cleaned up either.

@mduggan
Copy link
Contributor Author

mduggan commented Aug 14, 2014

A side-note.. the memory leak needs fixing either way, but does the Remote actually need the _repo member? It's not referenced anywhere..

@carlosmn
Copy link
Member

A git_remote needs its git_repository to outlive it. The only way to express this in python is via this reference to the Repository which spawned it.

@carlosmn
Copy link
Member

The fix looks like it would be to keep the callback payload as None until we perform a fetch and clear it out afterwards.

@mduggan
Copy link
Contributor Author

mduggan commented Aug 16, 2014

Ah yes, having the reference makes sense.

Since a few functions need the callbacks and there are various exceptions etc, maybe have a decorator like @needs_callbacks that can do the setup/cleanup of the payload? Unless that sounds like a terrible idea I'll try and do a PR with something like that.

@carlosmn
Copy link
Member

Those callbacks are only for fetch, so we shouldn't need anything so complicated. The only thing the reference really needs to outlive is the C.git_remote_fetch() call, which is where it's going to be used to call the callbacks.

There's a way to retrieve the set callbacks, but the simplest thing to do is probably to move the callback setting to the beginning of fetch() and then unset the callbacks structure and unreference the handle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants