Skip to content

Commit

Permalink
Build: standardize output directory for artifacts (#9888)
Browse files Browse the repository at this point in the history
* Build: standardize output directory for artifacts

Always expose the final artifacts under `_readthedocs/<format>` directory, where
`<format>` can be one of the following:

- html
- htmlzip
- pdf
- epub
- json

This allows all users of the platform to have the same behavior. No matter if
they are using the default Sphinx/MkDocs builders or a custom one by using
`build.commands`.

Besides, this commit removes the "intermediary" directory used _before_
uploading the files to S3. This allows users to modify the output files directly
from `_readthedocs/<format>` using `build.jobs.post_build`, for example.

- `BaseBuilder.type` is removed since it was not required anymore.
- There is no `old_artifact_path` anymore since there is no intermediary folder.
- The `sphinx-build` command is executed from path where the repository is
  cloned, instead from the the directory where the `conf.py` lives. This is to
  simplify the code and be able to use relative path for the build output
  directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`).
- A new `BaseSphinx._post_build` method was added to cleanup the output
  directory (leave only one .epub, .html and .pdf file before moving forward).
- These changes should allow us to support _all formats_ for `build.commands`.
  We would need to check if these standard directories exist, set these formats
  in the version and upload their content.
- Upload step now validate there is one and only one PDF, ePUB and ZIP file when
  they are built. Currently, we only support one file per version. In the future
  we could remove this restriction. It fails silently for now, which is not
  great. We need to refactor this code a little to fail the build properly and
  communicate this to the user. I think it's _the correct way_ of doing it.
- This could also allow to support other ways to build PDF (SimplePDF or
 `tectonic`, see #9598) and
  also ZIP (e.g. zundler, see
  readthedocs/meta#77 (comment)).

Closes #9179

* Build: decideif there is an output type based on the path existence

* Test: update build tests to match changes

* Test: check if the file exist before continue

* Build: simplify the code by running the commands inside the container

* Test: add checks for more commands

* Storage: use constants to make explicit artifact types

* PDF builder: raise an error if the PDF file is not found

* Build: use `relative_output_dir` and `absolute_output_dir`

* Builder: execute `sphinx-build` from the same directory as `conf.py`

I proposed to changed this CWD to be the one where the repository was cloned,
but that could lead to some unexpected behavior. So, it's better to be
conservative here and keep the CWD as it was.

There are some small downsides for this decision, but it's better than breaking
people's builds :)

* HTMLZip build: use `zip` inside the Docker container to build it

* Minor changes about docstring and final dot in a sentence :)

* Test: adapt them to match thew path and arguments changes

* pre-commit missing changes

* Sphinx builder: better default

* Update readthedocs/doc_builder/backends/sphinx.py

Co-authored-by: Eric Holscher <[email protected]>

* Update readthedocs/projects/tests/test_build_tasks.py

Co-authored-by: Eric Holscher <[email protected]>

* Minor changes to fix tests

* Docstring: explain why there is an exception handling at this place

Co-authored-by: Eric Holscher <[email protected]>
  • Loading branch information
humitos and ericholscher authored Jan 19, 2023
1 parent b89fed4 commit 239d890
Show file tree
Hide file tree
Showing 11 changed files with 420 additions and 301 deletions.
16 changes: 16 additions & 0 deletions readthedocs/builds/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -148,3 +148,19 @@
MAX_BUILD_COMMAND_SIZE = 1000000 # This keeps us under Azure's upload limit

LOCK_EXPIRE = 60 * 180 # Lock expires in 3 hours

# All artifact types supported by Read the Docs.
# They match the output directory (`_readthedocs/<artifact type>`)
ARTIFACT_TYPES = (
"html",
"json",
"htmlzip",
"pdf",
"epub",
)
# Artifacts that are not deleted when uploading to the storage,
# even if they weren't re-built in the build process.
UNDELETABLE_ARTIFACT_TYPES = (
"html",
"json",
)
9 changes: 2 additions & 7 deletions readthedocs/doc_builder/backends/mkdocs.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,10 +46,6 @@ class BaseMkdocs(BaseBuilder):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.yaml_file = self.get_yaml_config()
self.old_artifact_path = os.path.join(
os.path.dirname(self.yaml_file),
self.build_dir,
)

# README: historically, the default theme was ``readthedocs`` but in
# https://github.com/rtfd/readthedocs.org/pull/4556 we change it to
Expand Down Expand Up @@ -343,9 +339,8 @@ def get_theme_name(self, mkdocs_config):

class MkdocsHTML(BaseMkdocs):

type = 'mkdocs'
builder = 'build'
build_dir = '_build/html'
builder = "build"
build_dir = "_readthedocs/html"


# TODO: find a better way to integrate with MkDocs.
Expand Down
Loading

0 comments on commit 239d890

Please sign in to comment.