Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature flags to reduce dependencies #934

Closed
wants to merge 50 commits into from

Conversation

traviscross
Copy link
Contributor

We have 223 unique external direct and transitive dependencies. Many
of these are needed only for serving files via HTTP. Others are
needed only for making outbound HTTP/S requests. In certain
situations, such as when building Zola as part of a continuous
integration pipeline, these features are unnecessary, and we would
prefer a faster and simpler build.

In this commit, we add serve and request feature flags to Zola.
The serve flag controls whether the embedded HTTP server is built
and whether the zola serve command is available. The request flag
controls whether we support making outbound HTTP/S requests, such as
for link checking. Both of these features are enabled by default.

With both of these features disabled, such as when building with
--no-default-features, we require only 160 unique external
dependencies, a 28% reduction. Notably, we're able to drop both
openssl and tokio. Dropping openssl in particular can help with
producing statically-linked builds.

With the serve flag, we depend on 210 unique external crates. With
the request flag, we depend on 199 unique external crates.

Keats and others added 30 commits October 24, 2019 23:13
Presently when you `^C` in `zola serve` it is painted with the same
color as the previous message. This PR always ensures to reset the color
in colorize, before writing the newline.
…eme (getzola#817)

Links that start with a scheme (e.g., `tel:18008675309`) inadvertently
had a URL prefix prepended. Previously, only `mailto:` was handled, but
given the sheer number of [registered URI schemes][uri-schemes], a loose
pattern matcher is used to detect schemes instead.

External links, as identified by the renderer, are now limited to `http`
and `https` schemes.

Fixes getzola#747 and fixes getzola#816.

[uri-schemes]: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
…zola#818)

* Let toc is visable through Page & Section variables in templates

* Removed the current toc variable from page & section
* cargo fmt & clippy

* Skip anchor checking for URL with prefix in config
"[…] `&` normally indicates the start of a character entity reference or
numeric character reference; writing it as `&` […] allows `&` to be
included in the content of an element or in the value of an attribute."

From: https://en.wikipedia.org/wiki/HTML#Character_and_entity_references
* feat(pagination): Add `total_pages` in paginator object

* feat(pagination): Added doc for `total_pages`

* feat(pagination): Added test for `total_pages`
* fix(init):  handle already existing path

* chore: add tests
* Update installation.md

* Update cli-usage.md

* Update installation.md

* Update directory-structure.md

* Update configuration.md

* Update overview.md

* Update section.md

* Update page.md

* Update section.md

* Update configuration.md

* Update page.md

* Update section.md

* Update page.md

* Update shortcodes.md

* Update linking.md

* Update table-of-contents.md

* Update syntax-highlighting.md

* Update taxonomies.md

* Update search.md

* Update sass.md

* Update index.md

* Update multilingual.md

* Update overview.md

* Update pages-sections.md

* Update pagination.md

* Update taxonomies.md

* Update rss.md

* Update sitemap.md

* Update robots.md

* Update 404.md

* Update archive.md

* Update overview.md

* Update installing-and-using-themes.md

* Update creating-a-theme.md

* Update netlify.md

* Update github-pages.md

* Update gitlab-pages.md

* Update index.md

* Update page.md

* Update section.md

* Updates.
* Compute canonical path before adjusting parent path

* Don't use adjusted `parent` to recalculate `canonical` in `find_language`

* Add regression tests

- Test for correct canonical field when calling `new_page`
- Test for correct canonical field after calling `find_language`
* Add path to `TranslatedContent`

This makes it possible to retrieve the translated page through the `get_page` function.

* Use TranslatedContent::path field in test_site_i18n

Use it with the `get_page` function to get a reference to the page object.
This addresses the following Clippy warnings:

* clippy::option_and_then_some
* clippy::useless_format
* maybe_slugify() only does simple sanitation if config.slugify is false

* slugify is disabled by default, turn on for backwards-compatibility

* First docs changes for optional slugification

* Remove # from slugs but not &

* Add/fix tests for utf8 slugs

* Fix test sites for i18n slugs

* fix templates tests for i18n slugs

* Rename slugify setting to slugify_paths

* Default slugify_paths

* Update documentation for slugify_paths

* quasi_slugify removes ?, /, # and newlines

* Remove forbidden NTFS chars in quasi_slugify()

* Slugification forbidden chars can be configured

* Remove trailing dot/space in quasi_slugify

* Fix NTFS path sanitation

* Revert configurable slugification charset

* Remove \r for windows newlines and \t tabulations in quasi_slugify()

* Update docs for output paths

* Replace slugify with slugify_paths

* Fix test

* Default to not slugifying

* Move slugs utils to utils crate

* Use slugify_paths for anchors as well
Keats and others added 15 commits December 23, 2019 09:43
Certain tests involving HTTP requests were sometimes hanging
indefinitely, so this uses Mockito for HTTP mocking. This seemingly
resolves the issue and makes these tests more reliable.

The existing can_fail_404_links test has been renamed to
can_fail_unresolved_links, to represent what actually occurs in the
test. The can_fail_404_links test now deals with a proper 404
response.

Just to be clear, the check_site test in the site component will
still create outgoing HTTP requests (due to the URLs used in the
test_site), so this commit only uses HTTP mocking where possible.
* Treat 304 (Not Modified) requests as valid.

* Add tests for 301-to-200 links, 301-to-404 links, and 500 links.
This helps to test redirections and the previously-added
response.status() checking for non-success status codes in check_url().

* Make names for HTTP mock paths unique, to avoid weird behavior. It
seems like mocks with the same path can potentially bleed between
tests, so you may end up with an unexpected response which causes the
test to sometimes pass and sometimes fail.

* Fix Clippy warnings about String::from(format!()).
* Restore #![feature(test)] and extern crate test; statements, which
were mistakenly removed as part of the Rust 2018 edition migration.

* Fix rendering benchmark's usage of RenderContext. 6 parameters were
provided when 5 were expected.
hyper is already included in Zola due to the reqwest dependency (used
in the link_checker and templates components). Replacing Actix with
hyper in the serve command reduces the number of dependencies and
slightly improves build times and binary size.
The issue with the check_site test hanging and timing out seems to
be related to a similar reqwest issue, which was ultimately due to
an upstream bug in tokio and may be fixed in tokio 0.2.7 onward.
)

* Detect empty links on markdown rendering and issue an error

* Add a test for empty links stopping rendering with an error

* Assert error message is the expected one

When testing for empty links detection compare the error message
to make sure it's the correct error that stopped the process
and not some unrelated issue.
* Update installation.md

* Update cli-usage.md

* Update installation.md

* Update directory-structure.md

* Update configuration.md

* Update overview.md

* Update section.md

* Update page.md

* Update section.md

* Update configuration.md

* Update page.md

* Update section.md

* Update page.md

* Update shortcodes.md

* Update linking.md

* Update table-of-contents.md

* Update syntax-highlighting.md

* Update taxonomies.md

* Update search.md

* Update sass.md

* Update index.md

* Update multilingual.md

* Update overview.md

* Update pages-sections.md

* Update pagination.md

* Update taxonomies.md

* Update rss.md

* Update sitemap.md

* Update robots.md

* Update 404.md

* Update archive.md

* Update overview.md

* Update installing-and-using-themes.md

* Update creating-a-theme.md

* Update netlify.md

* Update github-pages.md

* Update gitlab-pages.md

* Updates.

* Skip link checking for URL with prefix in config (getzola#846)

* Fix some doc changes

* Section extra -> SitemapEntry (getzola#850)

* Update deps

* Remove tutorial link.

* Update overview.md

* Update page.md

* Update section.md

* Update netlify.md

* Update overview.md

* Change some wording.

* Update overview.md

Co-authored-by: Tjeu Kayim <[email protected]>
Co-authored-by: Vincent Prouillet <[email protected]>
Co-authored-by: Stan Rozenraukh <[email protected]>
We have 223 unique external direct and transitive dependencies.  Many
of these are needed only for serving files via HTTP.  Others are
needed only for making outbound HTTP/S requests.  In certain
situations, such as when building Zola as part of a continuous
integration pipeline, these features are unnecessary, and we would
prefer a faster and simpler build.

In this commit, we add `serve` and `request` feature flags to Zola.
The `serve` flag controls whether the embedded HTTP server is built
and whether the `zola serve` command is available.  The `request` flag
controls whether we support making outbound HTTP/S requests, such as
for link checking.  Both of these features are enabled by default.

With both of these features disabled, such as when building with
`--no-default-features`, we require only 160 unique external
dependencies, a 28% reduction.  Notably, we're able to drop both
`openssl` and `tokio`.  Dropping `openssl` in particular can help with
producing statically-linked builds.

With the `serve` flag, we depend on 210 unique external crates.  With
the `request` flag, we depend on 199 unique external crates.
@traviscross
Copy link
Contributor Author

traviscross commented Jan 24, 2020

In this commit, we have updated the installation documentation to reflect the available feature flags and their use.

We added these flags as part of work to support building Zola statically, so this has some relevance to issue #316. However, the benefit of having these feature flags likely stands on its own.

@traviscross
Copy link
Contributor Author

We have ensured that all tests pass on Zola and on all components in the workspace in all feature configurations. Further, we've tested building, serving, and checking the documentation in each configuration where the operation is relevant.

@Keats
Copy link
Collaborator

Keats commented Jan 25, 2020

I'm not super keen on that to be honest. I think features for Zola are only useful for some developers to avoid potential build issues, like sass-rs which is hard to build on some systems. The rest will always be built with all features enabled so there's no point splitting them.

Imo, a feature for sass (enabled by default) should be added but that's it.

@traviscross
Copy link
Contributor Author

Hi @Keats; thanks for sharing your concerns. Understand and agree that most people building or using this on their local systems will want all features enabled (I do too!). The use-case here is that we're building Zola to be run as part of a continuous integration system for building a site. The embedded HTTP server and link checker don't serve a purpose in that application. They are, in this application, actively unwanted because we don't want our CI systems accepting inbound or making outbound network requests in any case.

So even aside from reducing dependencies, that's worth it to us and probably to others using it in a similar manner.

On the dependency side, this has two main positive effects:

  • Removing OpenSSL from the dependency tree makes Zola easier to build reliably and in a statically-linked manner, which is particularly important for CI.
  • Removing the things that make or receive network requests and OpenSSL reduce Zola's attack surface and make security review and vulnerability tracking easier for the resulting binary, which is important for some people.

This patch is also a stepping stone to getting Zola building statically in its full configuration. Testing and working through the problems was made easier by this. As you mention, sass-rs is the other tricky thing to build. (I didn't carve that one out because it seemed more central to Zola's functionality than the embedded server or link checker.)

I'm expecting to submit further PRs to sass-rs and to Zola to get Zola building statically in its full configuration.

If you start to see the value after giving some thought to the use-case and benefits I've described, great. (I recognize the appeal may be to a niche group.) If not, let me know, and I'll close the PR, and we'll just maintain this patch out of tree. If you want, I'm also happy to submit a PR with a feature flag for sass, or to roll it into this PR if you think that would be better.

@Keats
Copy link
Collaborator

Keats commented Jan 25, 2020

Why build Zola from source everytime on CI though? It takes a good 5-10min on release mode vs grabbing it from one of the many package managers or just wget from the release page.

I think the only non-static part is reqwest right? I wanted to use the rustls feature (https://zola.discourse.group/t/remove-dependency-on-preinstalled-openssl/221) to get rid of openssl but it seems like some sites do not work with it; it might be ok for a SSG though?

@traviscross
Copy link
Contributor Author

traviscross commented Jan 25, 2020

Hi @Keats; we don't build it every time we build the site. We build it whenever our Zola branch changes and cache the built artifact. That cached artifact is used to build the site. As far as why we build it ourselves rather than using a distributed binary, the immediate motivation is that we're testing in-progress improvements to Zola, such as work toward a solution to #519. We'll be submitting those patches upstream as they become ready for your review.

Also, while our use-case is CI, and I focused a lot on that above, I think anyone running Zola on a server in any capacity would want to strip out at least the embedded HTTP server for the (at least perceived) security benefits mentioned. Sometimes it can be useful to trust that Zola (or any binary) is providing only a strict input -> output function.

Regarding static builds, sass-rs' dependency on libstdc++ is perhaps the bigger challenge than OpenSSL. That precludes the simple use the musl-gcc, which Rust uses by default when building statically with musl. I have managed to get Zola to build statically using musl-cross-make and some changes to Zola/sass-rs.

To get there, I found it useful to get OpenSSL out of the way, hence the origin story of this PR. However, I actually expect that getting Zola to build statically with OpenSSL won't be a big deal. We'll add a feature flag to Zola that will cause openssl-sys to build its vendored version of OpenSSL, and that will need to be passed when building statically. As you say, using rustls could be another option. It may be worth adding feature flags to select which is used (there are pros and cons to each). I can include that with the PRs I'll be submitting for the remainder of the static build work.

@traviscross
Copy link
Contributor Author

Also, it may be worth mentioning, most of the Cargo.toml plumbing in this PR is groundwork that would be needed anyway for plumbing openssl or rustls feature flags to where they are needed.

@traviscross
Copy link
Contributor Author

traviscross commented Jan 25, 2020

Apropos of the issue raised in the discourse group, the correct end result for Zola's own binary releases is probably to build them statically against OpenSSL, and perhaps against rustls when it's a bit more mature. That would avoid not just the conflict that user hit, but any conflicts with system libraries. It would also allow the Zola releases to be installed on non-gnu hosts such as Alpine Linux (which is a favorite of people using Docker... such as for CI).

Of course, that doesn't mean that the vendored OpenSSL needs to be the default for cargo build. That choice can be made separately, and it may be reasonable to use the system OpenSSL by default there.

(Actually, perhaps the ideal would be if the openssl-sys crate were to automatically fall back to its vendored copy when libssl-dev isn't available, like sass-sys does with libsass-dev.)

@Keats
Copy link
Collaborator

Keats commented Jan 26, 2020

Thanks for the background info.

For rustls I think it's mature right now, it's just that it intentionally doesn't support every algorithms on purpose, which means it might not work with some sites.
The main issue I have with the reqwest feature gating is that we changes the load_data function. reqwest is only used for zola check (a command that you probably want to run on CI, with HTTP request to check external links) and that function which both becomes something very different with or without the flag on.

I was planning to add the sass-rs feature for the next Zola version since it's such a pain to build at times and prevent people from contributing.

Rather than adding a feature gate for reqwest, I am more thinking of trying to switch it to use rustls and see if people report unexpected failures; we get rid of openssl entirely and maybe can help improve rustls with issues/bugs that way. If too many people report issues, we can back down and look again at switching to the static openssl version. I don't think I want to make that choice selectable, if someone really needs another version of reqwest, they can fork it and it's a few chars diff.

@traviscross
Copy link
Contributor Author

Thanks @Keats for the further thoughts. You're right about load_data. I hadn't fully considered what that function does with respect to outbound requests (or else I would have at least documented the changes there).

With respect to rustls and sass, that makes sense. I'm working up a commit series that switches to rustls, includes the sass feature flag, and rethinks the request feature a bit.

@Keats
Copy link
Collaborator

Keats commented Feb 5, 2020

Closing that for now as it has way too many conflicts and is better to start from scratch.

@Keats Keats closed this Feb 5, 2020
@Keats
Copy link
Collaborator

Keats commented Feb 7, 2020

@traviscross do you have an idea when/if you'll get to that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.