Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deprecations in SoftCosineSimilarity #2938

Closed
piskvorky opened this issue Sep 9, 2020 · 6 comments · Fixed by #2940
Closed

Fix deprecations in SoftCosineSimilarity #2938

piskvorky opened this issue Sep 9, 2020 · 6 comments · Fixed by #2940
Labels
bug Issue described a bug housekeeping internal tasks and processes impact MEDIUM Big annoyance for affected users reach HIGH Affects most or all Gensim users
Milestone

Comments

@piskvorky
Copy link
Owner

When running our CI test suite, I see an array of deprecation warnings:

https://dev.azure.com/rare-technologies/gensim-ci/_build/results?buildId=287&view=logs&jobId=f9575ddc-dec8-54e6-9d26-abb8bdd9bed7&j=f9575ddc-dec8-54e6-9d26-abb8bdd9bed7&t=180156a9-2bf9-537d-c84a-ef9e808c0367

Some are from gensim, some from scipy:
Screen Shot 2020-09-09 at 09 16 00

@Witiko could you please have a look? Is it something your existing PR already addresses?

If not, can you please fix those? Thanks.

@piskvorky piskvorky added bug Issue described a bug impact MEDIUM Big annoyance for affected users reach LOW Affects only niche use-case users labels Sep 9, 2020
@piskvorky piskvorky added this to the 4.0.0 milestone Sep 9, 2020
@piskvorky
Copy link
Owner Author

piskvorky commented Sep 9, 2020

…and we're also still using the deprecated smart_open.smart_open apparently, CC @mpenkov :

gensim/test/test_api.py::TestApi::test_load_dataset
gensim/test/test_api.py::TestApi::test_multipart_load
  d:\a\1\s\.tox\py37-win\lib\site-packages\smart_open\smart_open_lib.py:254: UserWarning: This function is deprecated, use smart_open.open instead. See the migration notes for details: https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function
    'See the migration notes for details: %s' % _MIGRATION_NOTES_URL

@piskvorky
Copy link
Owner Author

piskvorky commented Sep 9, 2020

And more:

gensim/test/test_datatype.py: 2 warnings
gensim/test/test_doc2vec.py: 14862 warnings
gensim/test/test_fasttext.py: 96 warnings
gensim/test/test_keyedvectors.py: 20 warnings
gensim/test/test_word2vec.py: 84 warnings
  D:\a\1\s\gensim\models\keyedvectors.py:1478: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
    fout.write(utils.to_utf8(prefix + str(key)) + b" " + row.tostring())

The whole CI log could do with a careful inspection.

@piskvorky piskvorky added reach HIGH Affects most or all Gensim users and removed reach LOW Affects only niche use-case users labels Sep 9, 2020
@Witiko
Copy link
Contributor

Witiko commented Sep 9, 2020

@piskvorky I can remove the deprecated parameters together with their tests and open a PR. Please note that this will break backwards compatibility, i.e. the PR should only be merged if we are going for the next major release of Gensim.

@piskvorky
Copy link
Owner Author

Thanks! Yes, we are going for the next major release. Current develop no longer supports py2.x etc, it's all 4.0.0+.

@mpenkov mpenkov added the housekeeping internal tasks and processes label Sep 10, 2020
@Witiko
Copy link
Contributor

Witiko commented Sep 10, 2020

Most of the deprecation warnings are fixed by #2940 except ...

…and we're also still using the deprecated smart_open.smart_open apparently, CC @mpenkov :

gensim/test/test_api.py::TestApi::test_load_dataset
gensim/test/test_api.py::TestApi::test_multipart_load
  d:\a\1\s\.tox\py37-win\lib\site-packages\smart_open\smart_open_lib.py:254: UserWarning: This function is deprecated, use smart_open.open instead. See the migration notes for details: https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function
    'See the migration notes for details: %s' % _MIGRATION_NOTES_URL

... this one, which originates from api.load("__testing_matrix-synopsis", return_path=True) where api is gensim.downloader. This likely needs to be fixed in gensim-data.

@piskvorky
Copy link
Owner Author

piskvorky commented Sep 10, 2020

Awesome, thanks!

except … this one, which originates from api.load("__testing_matrix-synopsis", return_path=True) where api is gensim.downloader.

It's probably the test line a bit lower down, without the return_path=True. That line will load the __testing_matrix-synopsis dataset, which does indeed call smart_open.smart_open():
https://github.com/RaRe-Technologies/gensim-data/releases/tag/__testing_matrix-synopsis

Gensim-data releases are immutable, so we should make a new release of the __testing_matrix-synopsis dataset, with smart_open.smart_open replaced by smart_open.open. Not urgent, and definitely not blocking for this PR.

@mpenkov feel free to merge #2940 when ready.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue described a bug housekeeping internal tasks and processes impact MEDIUM Big annoyance for affected users reach HIGH Affects most or all Gensim users
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants