
fix parallel read error and add test #112

Merged: 2 commits into readthedocs:master on Oct 1, 2020
Conversation

@rscohn2 (Contributor) commented Sep 30, 2020

Resolves #111

@humitos (Member) left a comment

This is super cool! Thanks for taking the time to research this and work on a solution! 😄

@@ -27,6 +28,10 @@ def remove_sphinx_build_output():
shutil.rmtree(build_path)


@pytest.mark.sphinx(srcdir=srcdir)
def test_parallel_build():
subprocess.check_call('sphinx-build -j 2 -W -b html tests/examples/parallel-build build', shell=True)
Member:

can't we call app.build here instead, with a parallel argument passed in some way?

Member:

I mean, similar to what we are doing in the rest of the tests.

Contributor Author:

The other tests rely on SphinxTestApp, which does not have a way to pass parallel=2 through to the Sphinx constructor. I submitted a PR to add it, but we would have to wait for the next sphinx release, and it would not be usable for your testing that runs on older versions of sphinx. You can see a discussion of the problem in a PR for another extension: executablebooks/sphinx-book-theme#225 (comment)

Member:

I see. Makes sense. So, should we check stderr or the exit code here to be sure that the subprocess didn't fail?

Member:

Oh, check_call raises an exception.

Contributor Author:

Right. And I added the test first and verified that it caught the problem before adding the fix.
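A minimal sketch of why no explicit assertion is needed: `subprocess.check_call` raises `CalledProcessError` on a nonzero exit code, so a failing `sphinx-build` makes the test fail on its own.

```python
import subprocess

# check_call raises CalledProcessError when the command exits nonzero,
# so a failing build command fails the test without an explicit assert.
try:
    subprocess.check_call('exit 1', shell=True)
    raised = False
except subprocess.CalledProcessError:
    raised = True

print(raised)  # True
```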

def merge_other(self, app, env, docnames, other):
    """Merge in data for the given docnames from a different `BuildEnvironment`
    object coming from a subprocess in a parallel build."""
    env.metadata.update(other.metadata)
Member:

Do you have an example of another collector that works in parallel and implements this method? It would be good to reference it here as well.

Contributor Author:

I could not find an example, so I read the documentation and printed out values while processing documents. Maybe @jakobandersen would be willing to look at it, because he diagnosed the problem here: sphinx-doc/sphinx#8256


I'm not familiar with what this extension is doing, so perhaps the following is irrelevant: technically you should not copy all data from other, but only the data related to the documents in docnames. However, I don't think I ever ran into a case where this didn't in practice mean "copy everything from other", so maybe it really is guaranteed that other contains data about at most those documents.

Contributor Author:

It will be easier to add the filter than to prove whether or not it is needed. I will submit another PR.
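Such a docnames filter could look like the following sketch. This is a hypothetical illustration with a stand-in FakeEnv class; the real Sphinx BuildEnvironment stores metadata in a dict keyed by docname.

```python
class FakeEnv:
    """Stand-in for Sphinx's BuildEnvironment for illustration only."""
    def __init__(self, metadata):
        self.metadata = metadata

def merge_other(env, docnames, other):
    # Copy only entries for the documents this subprocess actually read,
    # instead of merging everything from the other environment.
    for docname in docnames:
        if docname in other.metadata:
            env.metadata[docname] = other.metadata[docname]

env = FakeEnv({})
other = FakeEnv({'a': 1, 'b': 2})
merge_other(env, {'a'}, other)
print(env.metadata)  # {'a': 1}
```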

@rscohn2 mentioned this pull request on Oct 1, 2020
@humitos (Member) left a comment

Thanks again!

tests/test_urls.py (resolved)
notfound/extension.py (resolved)
Added some small TODO comments to come back to this in the future and have some context.
@humitos merged commit 842ff84 into readthedocs:master on Oct 1, 2020
@rscohn2 (Contributor Author) commented Oct 1, 2020

Thanks for making the effort to share the notfound extension. We use it here: https://spec.oneapi.com/versions/latest/index.html. The PDF is 1800 pages long and parallel read reduces processing time from 4.5 to 3.5 minutes, which helps when we are editing the doc.

@samccann commented

Hi @humitos - do you have a target date for when this fix will be in an official release? We're hitting this now on the Ansible documentation builds (with 0.5) and running the same doc builds with master here seems to fix it.

I'm not a coder so can't give much help in that regard, but if there's some noncoding help you need for this, let me know. Glad to help out on something we depend on :-)

@humitos (Member) commented Jan 4, 2021

@samccann I just released 0.6 that includes this change. Thanks for contacting us. Let me know if everything works as expected.

@samccann commented Jan 4, 2021

@humitos thanks, yes, it all works now!

Linked issue: parallel read error