Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doctrine Insert with XmlEntry #1472

Closed
jmortlock opened this issue Feb 12, 2025 · 6 comments · Fixed by #1475
Closed

Doctrine Insert with XmlEntry #1472

jmortlock opened this issue Feb 12, 2025 · 6 comments · Fixed by #1475

Comments

@jmortlock
Copy link
Contributor

Hi

If i try and insert a dataframe into a database that contains an XmlEntry, I get the following error:

Error: Object of class DOMDocument could not be converted to string

/var/www/html/zg/vendor/doctrine/dbal/src/Driver/PDO/Statement.php:48
/var/www/html/zg/vendor/doctrine/dbal/src/Connection.php:1809
/var/www/html/zg/vendor/doctrine/dbal/src/Connection.php:1205
/var/www/html/zg/vendor/doctrine/dbal/src/Connections/PrimaryReadReplicaConnection.php:292
/var/www/html/zg/vendor/flow-php/doctrine-dbal-bulk/src/Flow/Doctrine/Bulk/Bulk.php:42
/var/www/html/zg/vendor/flow-php/etl-adapter-doctrine/src/Flow/ETL/Adapter/Doctrine/DbalLoader.php:67
/var/www/html/zg/vendor/flow-php/etl/src/Flow/ETL/Pipeline/SynchronousPipeline.php:65
/var/www/html/zg/vendor/flow-php/etl/src/Flow/ETL/DataFrame.php:763

Some pseduo code

{"descriptionHtml": "<b>This is a test</b>"}
       return data_frame()
            ->read(from_json_lines("sample.json")
            ->write(to_dbal_table_insert(.....)

I had a quick look at submitting a fix, as I managed to hack the v0.10.0 code to get it to work, however looks like I need a postgress db to run the test-suite which i don't currently have installed, and i suspect this is already fixed or easy enough for you.

I believe you need to handle it here

'object' => match ($entry::class) {

I believe something like this

\DOMDocument::class => $entry->saveXML($entry->documentElement),

As i side note are they are workarounds you can think of to transform the datatype in the row?

Closest I got was

->withEntry('DESCRIPTION_HTML', ref('DESCRIPTION_HTML')->domElementValue())

But that strips out the html tags, which ideally i want to keep.

@norberttech
Copy link
Member

norberttech commented Feb 12, 2025

Hey!
What you are doing with ref()->domElementValue() is the correct approach.
I'm not sure why it's stripping HTML tags, as this function just takes nodeValue.
Could you provide some example of this behavior so I can look into it?

As for automated casting XMLEntry to string, we would need to introduce something called EntryNormalizer and call it from the Loader like we do in JSON or CSV for example. However, there is a trick part here: we are using dbal which can handle both objects and primitives through Types.
Casting on DataBulk level was never a good idea in the first place. I need to think how to migrate out of it.

So in theory, this could be solved by registering XMLType in Dbal.

That's why I'm not entirely convinced that introducing EntryNormalizer to Doctrine Adapter is a good idea.

@norberttech
Copy link
Member

oh and for the not installed postgres, you should use docker to setup local services.
I recently updated project contributing guidelines: https://flow-php.com/documentation/contributing/

You should find there everything you need to prepare local development environment.

@jmortlock
Copy link
Contributor Author

Ahh thanks to the pointer to the docs, i got it up and running locally easily enough.

  1. I've pushed up a PR that has a failing example for XML entry => Doctrine, which replicates what I am doing locally.
  2. With the domElementValue I found this existing test
    public function test_dom_element_value_from_dom_document() : void
    so not what i want in this instance. Maybe a new function domDocumentValue could be useful.

I have not tried to fix the issue, but registering a new type sounds like the way to go and I may experiment with that in my local codebase to see how it goes.

@norberttech
Copy link
Member

norberttech commented Feb 12, 2025

so not what i want in this instance. Maybe a new function domDocumentValue could be useful.

hmm but DOMElementValue should take $node->documentElement if it detects DOMDocument. I would try to look into existing function and try to figure out why it's removing html tags.

I have not tried to fix the issue, but registering a new type sounds like the way to go and I may experiment with that in my local codebase to see how it goes.

But regardless we should be able to cast XMLEntries to simple strings without losing html tags. Could you maybe try to use ref('xml')->cast(type_string()) instead of DOMElementvalue?

@jmortlock
Copy link
Contributor Author

I have updated tests to show examples of both the domElementValue and the cast approaching, both do not give the expected results.

@norberttech
Copy link
Member

Thank you, will fix it late today!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants