Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Free pointers in parser activations #4486

Merged
merged 2 commits into from
Oct 22, 2019

Conversation

adrianeboyd
Copy link
Contributor

@adrianeboyd adrianeboyd commented Oct 21, 2019

Description

Free the remaining pointers in the ActivationsC activations at the end of _parseC() in Parser.

This is the only definite spacy-related memory leak I found with valgrind using examples from #3618. (I think there may be some remaining issues related to cython/numpy.)

Types of change

Bugfix.

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@svlandeg svlandeg added bug Bugs and behaviour differing from documentation perf / memory Performance: memory use feat / parser Feature: Dependency Parser labels Oct 21, 2019
@honnibal
Copy link
Member

Oh awesome! Nice work. Would you consider writing up a tutorial on how to use Valgrind on a Cython project? As I mentioned in Slack, I've always wanted to use it but have had trouble getting started.

I feel sheepish about this memory leak, as it could have been discovered the old-fashioned way by just grepping for malloc and calloc. I normally try to be careful when I'm using them, because they break the normal pattern of cymem.Pool and take some vigilance. I'm not sure why I messed this up.

@adrianeboyd
Copy link
Contributor Author

I only kinda sorta know what I'm doing, but I can sketch out the basics of how I did this, sure!

@honnibal
Copy link
Member

honnibal commented Oct 21, 2019

Ah, I think I see how I missed this. The allocations and the frees are happening in different modules, which makes the mistake much more likely.

I suggest we make it a bit easier to see that the _parseC loop owns that memory. The smallest change would be to just call spacy.syntax._parser_model.resize_activations() at the start of the loop. Probably better is to add some setup and teardown functions to _parser_model.pyx:

cdef ActivationsC alloc_activations(int n_states) nogil:
    cdef ActivationsC acts
    memset(&acts, 0, sizeof(acts))
    resize_activations(&acts, n_states)
    return acts

cdef void free_activations(const ActivationsC* acts) nogil:
    free(acts.token_ids)
    free(acts.scores)
    free(acts.unmaxed)
    free(acts.hiddens)
    free(acts.is_valid)

We would then remove the resize_activations function from the _parser_model.pxd, as the allocations should be private to the module.

@adrianeboyd
Copy link
Contributor Author

However you think is best...

* Rewrite/restructure to have allocation and free in parallel functions
in `_parser_model` rather than partially in `_parseC()` in `Parser`.

* Remove `resize_activations` from `_parser_model.pxd`.
@honnibal honnibal merged commit 3dfc764 into explosion:master Oct 22, 2019
adrianeboyd added a commit to adrianeboyd/spaCy that referenced this pull request Oct 28, 2019
* Free pointers in ActivationsC

* Restructure alloc/free for parser activations

* Rewrite/restructure to have allocation and free in parallel functions
in `_parser_model` rather than partially in `_parseC()` in `Parser`.

* Remove `resize_activations` from `_parser_model.pxd`.
ines added a commit that referenced this pull request Oct 28, 2019
* Free pointers in parser activations (#4486)

* Free pointers in ActivationsC

* Restructure alloc/free for parser activations

* Rewrite/restructure to have allocation and free in parallel functions
in `_parser_model` rather than partially in `_parseC()` in `Parser`.

* Remove `resize_activations` from `_parser_model.pxd`.

* Try and fix pytest


Co-authored-by: Ines Montani <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugs and behaviour differing from documentation feat / parser Feature: Dependency Parser perf / memory Performance: memory use
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants