output format: PDF vs HTML #25

Open
VladimirAlexiev opened this issue Mar 28, 2024 · 4 comments

@VladimirAlexiev
Collaborator

I personally think that HTML is a more important output format for a spec, since this allows people to cite a precise spot using a link.
All W3C specs are in HTML, and tens of thousands of emails, specs and other documents refer to anchors within them (which W3C takes care to keep stable).
I know that ISO/IEC publish PDFs from Word templates and of course we cannot change this, but I think that it's equally important to produce HTML.

Some considerations:

  • HTML stored in GitHub can be rendered live with GitHub Pages, or simply with a site like rawgit2.com (e.g. see https://rawgit2.com/VladimirAlexiev/shacl/shaclc-grammars/shacl-compact-syntax/grammar/shaclc-XText.html)
  • For an online versioned spec, it's better to keep the HTML text pure, and keep subsidiary resources (CSS, JS, images) as separate files
  • For offline reading and sending by email, it's better to have self-contained HTML that includes all resources inside.
    • Pandoc can do this (a very powerful command line processor)
    • SingleFile Chrome addon can do it
    • I hope Org Export also can do it, but haven't checked
  • If you agree with "image format: SVG vs PNG vs PDF; display images inline in Org" (#24) to make SVG the primary image format, then SVG can be embedded directly (not as base64). E.g. the link above has embedded SVG and works very well.
    • Another small benefit is that the SVGs are searchable with control-F (not important for rdfpuml images, but very important for grammar diagrams)
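
To illustrate what "embedded directly" means here, below is a minimal Python sketch. It is not what Org, Pandoc or SingleFile actually do. It replaces each `<img src="….svg">` that points to a local file with the SVG markup itself, so the text inside the diagram stays searchable. The function names `inline_svgs` and `strip_xml_prolog` are made up for this example, and the regex assumes simple, double-quoted `src` attributes.

```python
import re
from pathlib import Path

# Assumption: <img> tags use double-quoted src attributes ending in .svg.
IMG_SVG = re.compile(r'<img\b[^>]*\bsrc="([^"]+\.svg)"[^>]*>', re.IGNORECASE)


def strip_xml_prolog(svg_text: str) -> str:
    """Drop the XML declaration and a simple DOCTYPE, which do not belong inside an HTML body."""
    svg_text = re.sub(r"<\?xml[^>]*\?>", "", svg_text)
    svg_text = re.sub(r"<!DOCTYPE[^>]*>", "", svg_text)
    return svg_text.strip()


def inline_svgs(html: str, base_dir: Path) -> str:
    """Replace <img> tags that reference local SVG files with the SVG markup itself."""

    def replace(match: re.Match) -> str:
        svg_path = base_dir / match.group(1)
        if not svg_path.is_file():
            # Leave remote or missing images untouched.
            return match.group(0)
        return strip_xml_prolog(svg_path.read_text(encoding="utf-8"))

    return IMG_SVG.sub(replace, html)
```

Note that this drops attributes of the original `<img>` tag (such as `alt` or `width`), so a real tool would need to carry those over to the `<svg>` element.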
@johanwk
Owner

johanwk commented Mar 29, 2024

I fully agree with your reasoning why HTML is a format that needs to be supported.

Maybe the simplest solution for standalone HTML is to use a specialised tool? I've never tried Monolith, but it fits the description!

On embedding SVGs in the exported HTML: I have been using a setup for HTML presentations based on org-re-reveal, which comes with an option for self-contained HTML export that embeds SVGs. However, this option is apparently not provided for the regular Org-to-HTML export, at least not according to the discussion at How can I embed SVG images.

If PNGs are needed, maybe the following Reddit post contains a good solution for the base64 encoding: Standalone HTML.
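
For reference, the base64 approach boils down to replacing the image path with a `data:` URI. The following is only a rough Python sketch of that idea, not the code from the Reddit post; `png_data_uri` and `img_tag_for_png` are hypothetical helper names.

```python
import base64
import html
from pathlib import Path


def png_data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data: URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def img_tag_for_png(path: Path, alt: str = "") -> str:
    """Build a self-contained <img> tag for a PNG file on disk."""
    return f'<img alt="{html.escape(alt)}" src="{png_data_uri(path.read_bytes())}">'
```

Base64 output is about a third larger than the binary file, which is one more reason to prefer directly embedded SVG where possible.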

@johanwk
Owner

johanwk commented Mar 29, 2024

One challenge we have is that the formatting options for exported HTML are a bit limited. I tend to use org-html-themes with the ReadTheOrg theme.

My HTML/CSS skills were last updated around 1997, so I'm not a candidate for developing a better export format :) Would it be ideal to have a format like W3C uses, e.g. for the SSN ontology?

@VladimirAlexiev
Collaborator Author

Thanks for telling me about org-re-reveal! I still use org-reveal but will switch. I traced image embedding to these functions (which handle svg directly, and png using base64):

I think ReadTheOrg is enough. The W3C spec style is popular, but I don't think it would be easy to achieve it.

  • All newer specs use a JS package called respec that handles citations, definitions, etc
  • As far as I can tell, W3C people edit the HTML directly (poor souls!)

@johanwk
Owner

johanwk commented Apr 9, 2024

There's actually quite a lot here that could be useful to add. It's slightly shocking that W3C expects direct HTML editing :D

In the long term, I'd like ELOT to support XML export, using a format like the one for ISO standards; see https://www.niso-sts.org/. This is clearly possible, but it would be a whole project in itself: it's not a small standard, as far as I can tell.

On embedding images, there's also a snippet here, which might work, could be worth trying: https://www.reddit.com/r/orgmode/comments/15ei2gt/standalone_html_on_org_export/

Regarding respec, should we add a new issue to look into that? I had a brief look at the user guide. org-mode has good "export filter" support, so maybe it's not difficult to output content to match. But I agree this isn't worth prioritising for quite a while yet.
