-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue in importing WXR files #1211
Labels
Comments
adamziel
added
[Type] Bug
An existing feature does not function as intended
[Feature] Import Export
labels
Apr 8, 2024
Thanks @adamziel ! |
9 tasks
adamziel
added a commit
that referenced
this issue
Dec 11, 2024
…2058) ## Description Adds the Data Liberation WXR importer as an option in the `importWxr` step. The new importer is turned by including the `"importer": "data-liberation"` option: ```json { "steps": [ { "step": "importWxr", "file": { "resource": "url", "url": "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml" }, "importer": "data-liberation" } ] } ``` When the `importer` option is missing or set to "default," nothing changes in the behavior of the step and it continues using the https://github.com/humanmade/WordPress-Importer importer. The new importer: * Rewrites links in the imported content * Downloads assets through Playground's CORS proxy * Parallelizes the downloads * Communicates progress This PR is a part of #1894 ## Implementation details This `importWxr` step fetches and includes the `data-liberation-core.phar` file. The phar file is built with [Box](https://box-project.github.io/box/configuration/) and contains the importer library with its dependencies, which is a subset of the Data Liberation library, a subset of the Blueprints library, and a few vendor libraries. This, unfortunately, means that any changes in the PHP files require rebuilding the .phar file. Here's how you can do it: ```bash nx build:phar playground-data-liberation ``` You can also build the entire Data Liberation package as a WordPress plugin complete with a wp-admin page: ```bash nx build:plugin playground-data-liberation ``` Both commands will output the built files to `packages/playground/data-liberation/dist` The progress updates are a first-class feature of the new importer. The updated `importer` step receives them in real-time via a `post_message_to_js()` call running after every import step. Then, it passes them on to the progress bar UI. ### Other changes * **TLS traffic now goes through the CORS proxy.** Since the new importer uses `AsyncHTTP\Client` which deals with raw sockets, Playground's [TLS-based network bridge](#1926) runs the outbound traffic through a cors proxy. Technically, `TCPOverFetchWebsocket` gets the `corsProxy` URL passed to the `playground.boot()` call. * A few composer dependencies were forked, downgraded to PHP 7.2 using Rector, and bundled with this PR to keep the Data Liberation importer working. ## Remaining work - [x] PHP 7.2 compatibility. Done by forking and Rector-downgrading dependencies that were incompatible with PHP 7.2. - [x] Report the importer's progress on the overall Blueprint progress bar - [x] Enqueue the data liberation plugin files for downloading at the blueprint compilation stage - [x] Don't eagerly rewrite attachments URLs in `WP_Stream_Importer`. Exposing this information to the API consumer requires an explicit decision. Do we rewrite it? Or do we ignore it? - [x] Fix the TLS errors at the intersection of Playground network transport and the async HTTP client library - [x] Separate the markdown importer and its dependencies (md parser, frontmatter parser, Symfony libraries) from the core plugin - [x] Ship the importer and its tree-shaken deps (URL parser) as a minified zip/phar ## Follow-up work - [ ] Reconsider the `WP_Import_Session` API – do we need so many verbosely named methods? Can we achieve the same outcomes with fewer methods? - [ ] Investigate why there's a significant delay before media downloads start on PHP 7.2 – 7.4. It's likely a PHP.wasm issue. ## Testing instructions * Default importer – [Open this link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}) and confirm it does what the current `importWxr` step do, that is it stays at "Importing content" for a moment, fails to fetch media files (CORS issues in network tools), but inserts posts and pages. * Data Liberation – [Open this link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22importer%22:%20%22data-liberation%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}), confirm the import progress is visible and that the content and media indeed get imported: ![CleanShot 2024-12-08 at 14 54 49@2x](https://github.com/user-attachments/assets/a7da3244-a10f-43d2-8e94-43d305220a7e) ## Related issues * #1211 * #2012 * #1477 * #1250 * #1780
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
The content is generated using https://playground.wordpress.net/?theme=twentytwentythree&wp=6.5&php=8.2&plugin=wordpress-importer&plugin=inseri-core where a new post is created. The post is exported in the
issue.xml
.Expected behaviour:
issue.xml
issue.xml
file downloaded previouslyIssue in Query API
import-wxr
The issue is present also in the blueprint.
It used to work fine until the end of last week. fad3ccf might be related to it.
The text was updated successfully, but these errors were encountered: