Custom 404 for subdomains #353

Closed
bdarnell opened this issue Mar 12, 2013 · 22 comments
Labels: Accepted (accepted issue on our roadmap), Improvement (minor improvement to code)

Comments

@bdarnell

I just moved www.tornadoweb.org to readthedocs using a CNAME. This broke some links (which is not ideal but I can live with it), but the resulting 404 page is not helpful (e.g. www.tornadoweb.org/documentation/). The 404 page for a recognized subdomain/cname should include a link to that hostname's root. (of course, custom redirects for my domain would be awesome too)

@ericholscher
Member

Yeah, I've thought a little about how to improve the 404 pages. I think having logic that looks for similarly named files and auto-redirects before 404ing would be pretty awesome as well. 404s for transferring projects give me a big sad :(

Can you explain exactly what you want on the page? Basically just more information about what the domain is, and other places on the domain they might want to look? This seems sane to me.

Cheers,
Eric


Eric Holscher
Maker of the internet residing in Portland, Or
http://ericholscher.com

@bdarnell
Author

It would be nice if the page could be branded as "Tornado" instead of "Read the Docs", and have the most prominent link on the page go to www.tornadoweb.org/ (the link at the top goes to the right place, but it looks like you're on the wrong site now). If I could upload my own html file that would be ideal (and then assuming I can run javascript I could do my own redirects from /documentation/ to /en/branch2.4/).

@ericholscher
Member

It would be pretty simple to handle smarter 404 logic, by adding it to this function: https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/core/views.py#L437 -- It would be nice to try and figure out a proper page to redirect to automatically, or at least give some possible pages they might want in the response as well.

@lolsborn

lolsborn commented May 1, 2014

It would be pretty nice if it were possible just to redirect to the project's custom 404.html if it exists. That way I can just create a 404.rst and add something to it that at least keeps the user on a page that looks like the rest of the site.

@ericholscher
Member

Thinking about this now, it should be pretty simple to do. We can look for a 404.html in the root directory of your docs. I've also thought it might be neat to create a customized 404 from RTD that's themed in your docs theme, which would fix a lot of the issues around breaking style during 404.

@ericholscher
Member

Looked into this. The 404.html at the root has relative URLs for the media files, so we can't use it for a generic 404 page. We need to either post-process the 404.html to make the media links absolute, or proxy the page somehow in a way that doesn't break things.
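To make the post-processing option concrete, here is a minimal sketch in Python. It is not code from Read the Docs: the function name `absolutize_links` and the `/en/latest/` base path are assumptions for illustration, and the regex only handles double-quoted `href`/`src` attributes.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." attributes (double-quoted only, for brevity).
_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def absolutize_links(html, base_path):
    """Rewrite relative href/src values so they resolve from base_path.

    base_path is a hypothetical version root such as '/en/latest/'.
    """
    def fix(match):
        attr, url = match.group(1), match.group(2)
        # Leave anchors, root-relative paths, and scheme-qualified URLs alone.
        if url.startswith(('#', '/')) or '://' in url:
            return match.group(0)
        return '{}="{}"'.format(attr, urljoin(base_path, url))

    return _ATTR.sub(fix, html)


page = '<link rel="stylesheet" href="_static/css/theme.css" />'
print(absolutize_links(page, '/en/latest/'))
```

A real implementation would probably want an HTML parser rather than a regex, but the idea is the same: every relative media link gets pinned to one known root so the page renders correctly no matter which URL triggered the 404.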

@ericholscher
Member

I think the best option would be to build a Sphinx extension that inserts a 404.rst if it doesn't exist, and then rewrites the linked media files on output. I'd be happy to integrate this into RTD if someone writes it :)

@ericholscher
Member

ericholscher commented Nov 13, 2017

I wrote an initial implementation of this that needs some cleanup to deploy:

def html_collect_pages(app):
    # Emit an extra '404' page rendered with the theme's page.html template.
    return [('404', {'body': '<h1>Page not found</h1>\n\nThanks for trying.'}, 'page.html')]


def finalize_media(app, pagename, templatename, context, doctree):
    """Point media files at our media server."""

    def pathto(otheruri, resource=False, baseuri='/'):
        """Hack pathto to return absolute URLs."""
        if resource and '://' in otheruri:
            # Allow non-local resources given by scheme.
            return otheruri
        elif not resource:
            otheruri = app.builder.get_target_uri(otheruri)
        if otheruri and otheruri[0] != '/':
            otheruri = '/' + otheruri
        uri = otheruri or '#'
        return uri

    # Only the 404 page needs absolute links; other pages keep relative ones.
    if pagename == '404':
        context['pathto'] = pathto


def setup(app):
    app.connect('html-collect-pages', html_collect_pages)
    app.connect('html-page-context', finalize_media)
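To see what the replacement `pathto` above returns without running a Sphinx build, the same logic can be pulled into a standalone function. This is only an illustrative sketch: `make_absolute_pathto` and `fake_target_uri` are hypothetical names, and `fake_target_uri` merely stands in for `app.builder.get_target_uri` by appending `.html`.

```python
def make_absolute_pathto(get_target_uri):
    """Build a pathto() like the one above, without needing a Sphinx app."""
    def pathto(otheruri, resource=False, baseuri='/'):
        if resource and '://' in otheruri:
            # Non-local resources given by scheme pass through untouched.
            return otheruri
        elif not resource:
            otheruri = get_target_uri(otheruri)
        if otheruri and otheruri[0] != '/':
            otheruri = '/' + otheruri
        return otheruri or '#'
    return pathto


# Hypothetical stand-in for app.builder.get_target_uri on an HTML builder.
def fake_target_uri(docname):
    return docname + '.html'


pathto = make_absolute_pathto(fake_target_uri)
print(pathto('_static/pygments.css', resource=True))  # /_static/pygments.css
print(pathto('genindex'))                              # /genindex.html
```

Note that this only prefixes a bare `/`. On a host that serves docs under a language and version prefix, the generated links would presumably need that prefix as well.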

@ericholscher
Member

It will also need updates to our nginx 404 settings, but getting the 404 pages working at any URL is the first step.

@humitos
Member

humitos commented Jan 2, 2019

> We can look for a 404.html in the root directory of your docs.

Adding some context here, this is what GitHub does: https://help.github.com/articles/creating-a-custom-404-page-for-your-github-pages-site/

@stsewd
Member

stsewd commented Jan 2, 2019

We need to consider that we host docs per version. I guess we could have this:

  • /en/v1/no-existing -> /en/v1/404.html
  • /en/no-existing -> /en/default-version/404.html

And we could fall back to RTD's own 404 page if users don't have one.
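The mapping above could be sketched roughly like this. The function name `custom_404_path`, its parameters, and the `'en'` default for an empty path are assumptions made for the example, not the actual Read the Docs implementation.

```python
def custom_404_path(request_path, versions, default_version):
    """Return the 404.html path to try for a missing page.

    versions is the set of version slugs the project has built;
    default_version is used when the URL does not name a known version.
    """
    parts = [p for p in request_path.split('/') if p]
    # Assumption: fall back to 'en' when the path carries no language at all.
    language = parts[0] if parts else 'en'
    if len(parts) > 1 and parts[1] in versions:
        version = parts[1]
    else:
        version = default_version
    return '/{}/{}/404.html'.format(language, version)


print(custom_404_path('/en/v1/no-existing', {'v1', 'latest'}, 'latest'))
print(custom_404_path('/en/no-existing', {'v1', 'latest'}, 'latest'))
```

If no file exists at the returned path, the caller would fall back to the generic Read the Docs 404 page.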

@humitos
Member

humitos commented Jan 10, 2019

Some context about our current setup:

  1. first, it tries to serve a static HTML file via NGINX,
  2. then it falls back to Django, which checks for redirects,
  3. if there are no redirects, it returns a rendered version of 404.html
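The three steps above can be sketched as a single lookup chain. This is a toy model only: the real setup is split between NGINX and Django (and, as noted, partly not public), and every name here (`serve`, `static_files`, `redirects`, `render_404`) is hypothetical.

```python
def serve(path, static_files, redirects, render_404):
    """Toy model of the lookup order: static file, then redirect, then 404."""
    # 1. Static HTML (the part NGINX handles).
    if path in static_files:
        return 200, static_files[path]
    # 2. Project redirects (the part Django checks).
    if path in redirects:
        return 302, redirects[path]
    # 3. Nothing matched: render the 404 page.
    return 404, render_404(path)


static_files = {'/en/latest/index.html': '<h1>Docs</h1>'}
redirects = {'/documentation/': '/en/latest/index.html'}

print(serve('/documentation/', static_files, redirects, lambda p: 'Not found: ' + p))
print(serve('/nope', static_files, redirects, lambda p: 'Not found: ' + p))
```

A custom 404 page would plug in at step 3, by making `render_404` return the project's own 404.html when one exists.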

NOTE: some pieces of this configuration are not public.


Considerations

  1. custom 404 pages will allow branding the docs completely (probably more important on the corporate site)
  2. all URLs have to be hardcoded (if you 404 on /foo/bar/baz.html and we load a 404 handler generated with relative paths in /404.html, the links for, say, js/something.js will point to /foo/bar/js/something.js, so it's not straightforward to use Sphinx for this)
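The relative-path problem in the second consideration can be reproduced with Python's standard `urljoin`, which resolves links the same way a browser does. The host name `docs.example.com` is a placeholder.

```python
from urllib.parse import urljoin

# The URL the visitor asked for, which does not exist.
missing_page = 'https://docs.example.com/foo/bar/baz.html'

# A relative link inside the served 404 page resolves against the missing URL...
relative = urljoin(missing_page, 'js/something.js')
# ...while an absolute path resolves the same way from anywhere on the host.
absolute = urljoin(missing_page, '/en/latest/_static/js/something.js')

print(relative)  # https://docs.example.com/foo/bar/js/something.js
print(absolute)  # https://docs.example.com/en/latest/_static/js/something.js
```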

Idea of a potential solution

From the Read the Docs server/config side, at step 3 of our current setup, we could check whether the project already has a 404.html page under resolve_path(project, version_slug=version.slug, language=language, filename='404.html') and serve it directly.

NOTE: I'm using version_slug to locate the 404.html file, which may not make sense; the default_version should perhaps be used instead (or at least be the fallback).

From the user side, a plain HTML file (with hardcoded URLs) has to be provided at /404.html. This file could be generated from a .rst if we write a Sphinx extension that converts all the relative URLs to absolute ones based on some configs like domain, language and version.

If we could write this extension, I think it would be good UX from the user's perspective.

NOTE: this idea will only work with Sphinx, though.

@humitos humitos self-assigned this Jan 16, 2019
@humitos
Member

humitos commented Jan 19, 2019

> From the user side, a plain HTML file (with hardcoded URLs) has to be provided at /404.html. This file could be generated from a .rst if we write a Sphinx extension that converts all the relative URLs to absolute ones based on some configs like domain, language and version.

I worked on this and I created the Sphinx extension (sphinx-notfound-page) for this based on @ericholscher's solution from this issue.

You can see a live example under Read the Docs: https://test-builds.readthedocs.io/en/custom-404-page/

This example does not do everything we need yet, because the RTD source code also needs an update to serve this 404.html page for every not-found page.

If you check the source code of the 404 page served by the example, you will see that all the links are absolute. For example:

  <link rel="stylesheet" href="/en/latest/_static/css/theme.css" type="text/css" />
  <link rel="stylesheet" href="/en/latest/_static/pygments.css" type="text/css" />
  <link rel="index" title="Index" href="/en/latest/genindex.html" />
  <link rel="search" title="Search" href="/en/latest/search.html" /> 

So, it seems the only missing piece here is to modify the RTD source code to find this file and serve it if it exists.

@humitos
Member

humitos commented Feb 12, 2019

The PR was merged and deployed. I'd like to hear back from users who wanted a custom 404 page: were you able to configure it properly? Dropping a 404.html with absolute URLs for resources in the root of your documentation's output should be enough.

@kdheepak

kdheepak commented Mar 5, 2019

Works great!

@Solosneros

@humitos We created a custom 404.rst in the root and let Read the Docs render the pages. All paths that start with /en/latest/, like /en/latest/pagethatdoesnotexist, show the 404.html perfectly in the theme, but when the /en/latest part is missing, the 404.html is shown with broken CSS and no theme. Any hints on how to fix that? :)

@humitos
Member

humitos commented Mar 8, 2019

@Solosneros yes, you have to use absolute links to make the resources load properly. This is the most important part.

You have 2 options to achieve this:

  1. create the HTML by hand, hard-coding all the resource URLs (and adding that 404.html page as a static file using the config html_extra_path)
  2. use the extension https://github.com/rtfd/sphinx-notfound-page that automatically generates the page for you with the proper URLs
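As a rough illustration of the two options, a `conf.py` fragment might look like the following. The directory name `extra` is made up for the example. The extension's entry in `extensions` is its module name, `notfound.extension`, which differs from the `sphinx-notfound-page` package name used to install it.

```python
# conf.py (fragment): pick ONE of the two options below.

# Option 1: ship a hand-written 404.html with hard-coded absolute URLs.
# Sphinx copies the contents of each directory listed in html_extra_path
# into the root of the HTML output.
# 'extra' is a hypothetical directory next to conf.py that holds 404.html.
html_extra_path = ['extra']

# Option 2: let sphinx-notfound-page generate the page instead.
# Install the 'sphinx-notfound-page' package, then enable its module:
# extensions = ['notfound.extension']
```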

@Solosneros

Solosneros commented Mar 8, 2019

@humitos thanks for the fast reply. I tried using your extension and the build keeps failing. Is the extension automatically included in ReadtheDocs.org?

Could not import extension sphinx-notfound-page (exception: No module named 'sphinx-notfound-page')

@kdheepak

kdheepak commented Mar 8, 2019

You probably need to add it to a requirements.txt file and let rtd know where that file is located.

@humitos
Member

humitos commented Mar 10, 2019

@Solosneros no, you have to install it as any other dependency (https://docs.readthedocs.io/en/latest/guides/specifying-dependencies.html) as @kdheepak mentioned.
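One way to check from inside the build environment whether the dependency actually got installed is to ask Python if the module can be found. This is a generic sketch, not something Read the Docs provides. The helper name `extension_importable` is hypothetical, and `notfound` is the top-level module that the sphinx-notfound-page package installs.

```python
import importlib.util


def extension_importable(top_level_module):
    """Return True if the top-level module can be found in this environment."""
    return importlib.util.find_spec(top_level_module) is not None


# Prints False until the sphinx-notfound-page package has been installed.
print(extension_importable('notfound'))
```

The error quoted earlier names a module called 'sphinx-notfound-page', which suggests (this is a guess) that the package name rather than the module name ended up in `extensions`. A hyphenated name like that is never a valid Python module name, so it cannot be imported even after installation.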

@Solosneros

@humitos thanks, it works now :)

@krzychb

krzychb commented Mar 18, 2019

@humitos I was just looking for custom 404 page support.
Thank you for this extension 👍

Separate thanks to @bdarnell for his foresight requesting this feature 😃
