Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blueprints: Rename importFile to importWxr, switch to humanmade/WordPress importer #1192

Merged
merged 12 commits into from
Apr 4, 2024

Conversation

adamziel
Copy link
Collaborator

@adamziel adamziel commented Apr 4, 2024

Problem addressed

The importFile step works by interacting with HTML form like a user would. Problem is, the WXR importer requires some interactions with the forms which we fake by adjusting the POST data before submitting the form. This approach causes a number of issues listed below.

More context: #1183.

Implementation

Refactors the importFile Blueprint step to:

  • Be named importWxz. The old name continues to work, though.
  • Use the humanmade/WordPress-Importer for importing content – it is
    more reliable than the official wordpress-importer and has a
    convenient PHP API.
  • Drop WXZ support – it's a maintenance burden and doesn't add clear
    value.
  • Remove dependency on DOMParser – this step should now work in wp-now

Testing instructions

  1. Go to http://localhost:5400/website-server/?import-wxr=https://raw.githubusercontent.com/helgatheviking/Simple-User-Listing/3e38ea3eb3a057424fbae4305cba2fd9f73d896b/demo-content/demo-content.xml&php=8.0&wp=6.5&storage=none&php-extension-bundle=kitchen-sink
  2. Confirm the "Simple User Listing Block" page was created
  3. Confirm in wp-admin a bunch of user accounts were created
  4. Confirm in wp-admin the WordPress Importer plugin was installed

Closes #1183
Closes #379
Closes #1158
Closes #1161

cc @dmsnell

adamziel added 3 commits April 4, 2024 01:47
…ress-Importer

Rewires the importFile step to:

* Be named importWxz. The old name continues to work, though.
* Use the humanmade/WordPress-Importer for importing content – it is
  more reliable than the official wordpress-importer and has a
  convenient PHP API.
* Drop WXZ support – it's a maintenance burden and doesn't add clear
  value.

Closes #1183
Closes #379
Closes #1158
Closes #1161
Copy link
Member

@dmsnell dmsnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine. I may have missed this point

Remove dependency on DOMParser – this step should now work in wp-now

Is that a TODO still? or is this already part of this PR?

@flexseth
Copy link
Collaborator

flexseth commented Apr 4, 2024

importWxr looks good, just looking back at everything in the steps

  • Be named importWxz. The old name continues to work, though.
  • Use the humanmade/WordPress-Importer for importing content – it is
    more reliable than the official wordpress-importer and has a
    convenient PHP API.

@flexseth
Copy link
Collaborator

flexseth commented Apr 4, 2024

Interested to see the tools

@adamziel
Copy link
Collaborator Author

adamziel commented Apr 4, 2024

Is that a TODO still? or is this already part of this PR?

@dmsnell It's a part of this PR. import-wxr.ts no longer parses the HTML output using DOMParser() as import-file.ts did:

function DOM(response: PHPResponse) {
	return new DOMParser().parseFromString(response.text, 'text/html');
}

@adamziel adamziel force-pushed the switch-to-humanmade-importer branch from e8f8c96 to 9da746c Compare April 4, 2024 08:52
@adamziel
Copy link
Collaborator Author

adamziel commented Apr 4, 2024

I've removed the site-data.spec.ts because of a weird, seemingly intermittent error it produced that seemed unrelated to this PR. I'll restore it in a separate PR and let's discuss the failure there.

@adamziel adamziel merged commit fad3ccf into trunk Apr 4, 2024
5 checks passed
@adamziel adamziel deleted the switch-to-humanmade-importer branch April 4, 2024 10:50
adamziel added a commit that referenced this pull request Apr 4, 2024
The test was removed in #1192 due to intermittent failures like these:

```
FAIL  src/lib/steps/site-data.spec.ts > Blueprint step setSiteOptions() > should set the site option
```

They seem related to #1189, but all the other tests pass and the
Playground website works so I wonder what's going on there. Is the CI
runner just running out of memory? But then it shouldn't be able to
allocate the memory segment in the first place. Weird!
@flexseth
Copy link
Collaborator

flexseth commented Apr 5, 2024

Question @dmsnell and @adamziel - I was looking at the discussion about data liberation

WXR (WordPress eXtended RSS) has been around for a while - and it's done a good job. But there's certainly room for improvement. Especially when it comes to including media/attachments and certain other things.

On data liberation

Is WXR the right format?
Is WXZ the right format?
Do we currently have the right format?
Is this a good opportunity to create the right format?

Portability

Moving from one instance to another should be what Playground does.
Duplicating sites through different means

Exit plan

Format the data so the user can take it with them, wherever they go

Can we discuss? 😎

@adamziel
Copy link
Collaborator Author

adamziel commented Apr 5, 2024

@flexseth WXR is for moving content, as in posts, pages etc. It won't move custom database tables and has some problems with media files, though. I'd rather not spend time on expanding it. Site Transfer Protocol and Playground Snapshots will enable moving sites between Playgrounds and WordPress hosts.

adamziel added a commit that referenced this pull request Apr 8, 2024
Preserves the backslashes in the content imported through `importWxr` by
calling `wp_slash` on the entire imported data.

This issue started after the recent switch to the
`humanmade/wordpress-importer` in
#1192. Turns out
that importer doesn't call `wp_slash` on its own.

Closes #1211
@dmsnell
Copy link
Member

dmsnell commented Apr 10, 2024

Is WXR the right format?
Is WXZ the right format?
Do we currently have the right format?
Is this a good opportunity to create the right format?

@flexseth to add to what @adamziel wrote, I think these are great questions, but mostly unrelated to this PR. it would be nice to join the discussion in WordPress.org Slack in the #data-liberation channel.

Personal opinion time: I think it's valuable to do a proper job with our existing commitments while also exploring new ideas. This PR I think is working to ensure that we do well the things we say we do, while leaving the door open to other kinds of imports.

@flexseth
Copy link
Collaborator

@dmsnell sounds good! I'll link the PR in the docs on importing and exporting with Playground for future discussions!

adamziel added a commit that referenced this pull request Apr 11, 2024
The test was removed in #1192 due to intermittent failures like these:

```
FAIL  src/lib/steps/site-data.spec.ts > Blueprint step setSiteOptions() > should set the site option
```

They seem related to #1189, but all the other tests pass and the
Playground website works so I wonder what's going on there. Is the CI
runner just running out of memory? But then it shouldn't be able to
allocate the memory segment in the first place. Weird!
adamziel added a commit that referenced this pull request Apr 11, 2024
The test was removed in #1192 due to intermittent failures like these:

```
FAIL  src/lib/steps/site-data.spec.ts > Blueprint step setSiteOptions() > should set the site option
```

![CleanShot 2024-04-04 at 13 09
25@2x](https://github.com/WordPress/wordpress-playground/assets/205419/fe68c5ff-80a7-4966-82a8-fbfcc7e9a101)

They seem related to #1189, but:

* I can't reproduce it locally
* The test actually passed in #1189 and only started failing in #1192
* All the other tests pass and the Playground website works so I wonder
what's going on there.
* The stack trace mentions network activity, but that call shouldn't
involve any. It's almost as if the trace was coming from
`php-networking.spec.ts`, but those tests actually pass.

Is the CI runner just running out of memory? But then it shouldn't be
able to allocate the memory segment in the first place. Weird!

CC @brandonpayton for insights
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants