pdf migrations #21

Merged: jamesthebrooks merged 5 commits into main from chore-pdf-migration on Nov 12, 2024

Conversation

jamesthebrooks
Contributor

Changes proposed in this pull request:

  • migrated pdf files from jekyll site

@jamesthebrooks jamesthebrooks changed the title from "df migration" to "pdf migrations" Oct 24, 2024

sknep commented Oct 29, 2024

PDFs make me sad. Would muuuuch rather see these converted to HTML with a redirect from that path. This meets OMB M-23-22 guidance. What do we gain by checking in these PDFs?

@jamesthebrooks jamesthebrooks force-pushed the chore-pdf-migration branch 3 times, most recently from 26e9a85 to db8d87e October 30, 2024 17:30
@jamesthebrooks
Contributor Author

PDFs make me sad. Would muuuuch rather see these converted to HTML with a redirect from that path. This meets OMB M-23-22 guidance. What do we gain by checking in these PDFs?

I don't have an answer or opinion on that. That said, if we're going to do this as a feature upgrade, I would prefer to create a new ticket outside of this epic to approach that upgrade. That way we can specify requirements and acceptance criteria for each effort without muddying the migration. Then we can define things like PDF style guides, headers, etc. and someone on the team can have a clearly defined goal.


sknep commented Nov 6, 2024

new ticket outside of this epic

Converting PDFs to HTML pages was always on the list of things that needed to happen for this work, as I described it a while back:

... specifically: migrate PDF content to HTML, delete the old PDFs

Either we do it now or we do it later/soon, but all PDF content needs to be remediated to HTML if we want to be compliant on this site. Can you scope how long it would take you to get a fair web experience up for this as a collection in HTML, not PDF? I understand the designs wouldn't be 1-1. Can you make an estimate of how long it would take and what state the content would be in, in terms of quality of experience?

define things like PDF style guides, headers, etc

Do you mean like... re-export the PDFs with a consistent style? OR just how the content that is currently in PDFs ends up in HTML?


apburnes commented Nov 7, 2024

@sknep we don't have a functionality currently to redirect from the old PDFs to the new HTML page. Can we create the HTML representations of the old PDFs and leave the PDFs for now while we think about how we could accomplish it?

@jamesthebrooks
Contributor Author

For today, do we want to just copy PDF text content out and put in a file /resources/{original-file-name-without-extension}? For unlinked files, just delete them?


sknep commented Nov 7, 2024

we don't have a functionality currently to redirect from the old PDFs to the new HTML page

Let's try with a meta refresh / redirect and see if the mimetypes mess it up: https://www.sitebuilderone.com/blog/11ty-url-redirections.html
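
One way to reason about the mimetype concern before deploying: a static host typically chooses a Content-Type from the file extension, so a page written to `<name>.pdf/index.html` would likely be served as HTML even though the URL contains `.pdf`. The sketch below uses Python's standard `mimetypes` module as a stand-in for that lookup. The actual host's behavior is an assumption here and should be checked against the deployed site.

```python
import mimetypes
from typing import Optional


def guessed_type(path: str) -> Optional[str]:
    """Return the MIME type guessed from the path's final extension."""
    mime, _encoding = mimetypes.guess_type(path)
    return mime


# Path taken from the first redirect stub in this PR.
original = "/assets/documents/Office_Hours_EC_ES_Tech_Talk.pdf"
# The stub's permalink places an index.html inside a directory named like the PDF.
redirect_page = original + "/index.html"

print(guessed_type(original))       # application/pdf
print(guessed_type(redirect_page))  # text/html
```

If the host resolves the bare `.pdf` URL to the directory's `index.html`, the browser should receive HTML and follow the refresh; if it instead labels the response as a PDF, the redirect would not work.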

Contributor Author

jamesthebrooks commented Nov 7, 2024

new ticket outside of this epic

Converting PDFs to HTML pages was always on the list of things that needed to happen for this work, as I described it a while back:

... specifically: migrate PDF content to HTML, delete the old PDFs

Either we do it now or we do it later/soon, but all PDF content needs to be remediated to HTML if we want to be compliant on this site. Can you scope how long it would take you to get a fair web experience up for this as a collection in HTML, not PDF? I understand the designs wouldn't be 1-1. Can you make an estimate of how long it would take and what state the content would be in, in terms of quality of experience?

define things like PDF style guides, headers, etc

Do you mean like... re-export the PDFs with a consistent style? OR just how the content that is currently in PDFs ends up in HTML?

Right now, I think I would like to discuss with the team. After talking with you yesterday, I was thinking that this was going to be a simpler task. None of the content seems to duplicate existing HTML pages, and after talking with @apburnes today, it doesn't seem like it's currently possible to redirect from a missing PDF anyway, so we would need to spec out how that would work as well. So as of right now, here is what I am clear on:

  • Cannot redirect from PDFs
  • Can delete unlinked files that seem to be referenced nowhere
  • Can convert PDF content to HTML, but would like some requirements around this such as:
    • page layout and styling requirements
    • any possible requirements around images

If it's as simple as "keep paragraphs and images and forget about other details", that's good enough for me. I'd just like to know before I start.
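
The "unlinked files" check above could be done with a small script. This is a minimal sketch, not the process used in this PR: the function names, the source-file suffixes, and the sample data are all hypothetical, and a plain substring match can miss links that are built dynamically.

```python
from pathlib import Path


def find_unlinked_pdfs(pdf_names, page_texts):
    """Return PDF file names that appear in none of the given page texts."""
    return sorted(
        name for name in pdf_names
        if not any(name in text for text in page_texts)
    )


def load_texts(root, suffixes=(".md", ".html", ".njk")):
    """Read every source file under root that has one of the given suffixes."""
    return [
        p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    ]


# Tiny in-memory example; these names and this page text are made up.
pdfs = ["linked-report.pdf", "orphan-slides.pdf"]
pages = ['<a href="/assets/documents/linked-report.pdf">Report</a>']
unlinked = find_unlinked_pdfs(pdfs, pages)
print(unlinked)  # ['orphan-slides.pdf']
```

On a real checkout, `load_texts` would be pointed at the site's content directory and the PDF names gathered from the documents folder; anything the function returns is a candidate for deletion, pending a manual look.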


Revised:

Won't be doing PDF -> HTML/MD conversions for now. Will:

  1. delete PDFs not linked to
  2. attempt a meta refresh to redirect the remaining PDFs to another page, where that is simple to do, choosing the closest related topic
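
For reference, a meta refresh page is only a few lines of HTML. The sketch below builds one in Python purely to show the shape of the output; the function name and the page wording are hypothetical, and how the site's layout actually consumes the `redirect` front-matter value is an assumption not shown in this PR.

```python
from html import escape


def meta_refresh_page(target: str, delay_seconds: int = 0) -> str:
    """Build a minimal HTML page that redirects the browser to target."""
    safe = escape(target, quote=True)
    return (
        "<!doctype html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        f'  <meta http-equiv="refresh" content="{delay_seconds}; url={safe}">\n'
        f'  <link rel="canonical" href="{safe}">\n'
        "  <title>Page moved</title>\n"
        "</head>\n"
        "<body>\n"
        # Visible fallback link for clients that ignore the refresh.
        f'  <p>This document has moved to <a href="{safe}">{safe}</a>.</p>\n'
        "</body>\n"
        "</html>\n"
    )


# Target taken from the first redirect stub in this PR.
page = meta_refresh_page("/docs/overview/customer-service-objectives/")
print(page)
```

The visible link in the body is there so that readers and crawlers that do not follow the refresh still have a way to reach the new page.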

@@ -0,0 +1,5 @@
---
permalink: /assets/documents/Office_Hours_EC_ES_Tech_Talk.pdf/index.html
redirect: /docs/overview/customer-service-objectives/

@@ -0,0 +1,5 @@
---
permalink: /assets/documents/cloudgov-vendors-2019.pdf/index.html

https://cloud.gov/assets/documents/cloudgov-vendors-2019.pdf doesn't exist... is this the right source URL?

ah, the permalink should say:

permalink: /resources/cloudgov-vendors-2019.pdf

Contributor Author

Updated the permalink. Good catch! Where should the redirect go?

@@ -0,0 +1,5 @@
---
permalink: /assets/documents/federalist-system-architecture.pdf/index.html
redirect: /pages/documentation/how-pages-works/

I'm asking about this one in slack. It might already exist somewhere else, like on our diagrams site at https://diagrams.fr.cloud.gov/

@@ -0,0 +1,5 @@
---
permalink: /assets/documents/how-federalist-works-for-presentation.pdf/index.html
redirect: /pages/documentation/how-pages-works/

should this redirect to the PDF you're including at:

_assets/documents/how-pages-works-diagram.pdf

Contributor Author

I'm not sure honestly. We can delete how-pages-works-diagram.pdf and redirect to /pages/documentation/how-pages-works/. For how-federalist...pdf, this was just a best guess.

ah, federalist is the old name of Pages, so i think it has just been replaced by newer content 2x

@sknep sknep Nov 7, 2024

I think we should definitely delete this one! no redirect! into the ether!

Let's delete and redirect this one to https://cloud.gov/pages/success-stories/#afwerx

let's delete and redirect this one to the pages homepage, cloud.gov/pages

Let's delete and redirect this one to https://cloud.gov/pages/success-stories/#afwerx

Let's delete and redirect to https://cloud.gov/pricing/

these diagrams might not have a good replacement, in which case a ticket for me to rebuild them in something accessible is the right call. do you know where these are used?

Contributor Author

Seems like this is the only place they're referenced: https://cloud.gov/docs/compliance/diagrams/

cloud-gov/system-diagrams#164

I don't expect this to get closed soon, so we can keep this diagram for now. When that ticket is addressed, we can come back to this.

@sknep sknep self-requested a review November 12, 2024 16:31

@sknep sknep left a comment

I think this is shippable!

btw next time -- feel free to dismiss and re-request reviews if you're waiting to hear back, I didn't realize you were done and waiting on me

@jamesthebrooks jamesthebrooks merged commit fbffc32 into main Nov 12, 2024
3 checks passed
@sknep sknep deleted the chore-pdf-migration branch November 12, 2024 17:36

3 participants