Inline <p> not handled well #92

Closed
mirabilos opened this issue Jul 7, 2023 · 1 comment · Fixed by #120

Comments

@mirabilos

Input:

Wie nennt man jemanden, der gegen Covid-Politik Proteste organisiert, und sich dann mit einem Covid-Testcenter an der Zitze des Staates labt?<p><a href="https://www.mdr.de/nachrichten/sachsen-anhalt/dessau/bitterfeld/buergermeister-raguhn-jessnitz-loth-afd-100.html">Bürgermeister und AfD-Abgeordneter</a>.

Proposed patch:

-        return '%s\n\n' % text if text else ''
+        return '\n\n%s\n\n' % text if text else ''

I'm unsure whether the trailing newlines actually need to be there, but they do no harm. The output has to be postprocessed to clean up newlines anyway.
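For illustration, the kind of newline cleanup meant here could look roughly like the following. This is only a sketch: `collapse_blank_lines` is a hypothetical helper written for this comment, not part of markdownify, and it assumes that runs of three or more newlines are never meaningful in the converted output.

```python
import re


def collapse_blank_lines(markdown):
    """Hypothetical postprocessing helper (not a markdownify API)."""
    # Squeeze any run of 3+ newlines down to a single blank line.
    collapsed = re.sub(r'\n{3,}', '\n\n', markdown)
    # Drop leading/trailing newlines and end with exactly one newline.
    return collapsed.strip('\n') + '\n'


cleaned = collapse_blank_lines('\n\nTEXT1\n\n\n\nTEXT2\n\n')
print(repr(cleaned))
```

With a step like this in place, the extra leading newlines added by the patch would not show up as stray blank lines in the final document.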

@chrispy-snps
Collaborator

Smaller testcase:

>>> from markdownify import markdownify as md
>>> md('TEXT1<p>TEXT2</p>')
'TEXT1TEXT2\n\n'

There is no line break before the TEXT2 paragraph content.
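The effect of the proposed patch on this testcase can be sketched in isolation, without markdownify itself. The two functions below are stand-ins that copy only the return expressions from the diff; the names `convert_p_original` and `convert_p_patched` are made up for this sketch, and the surrounding text is simply concatenated, which is an assumption about how the converter joins its pieces.

```python
def convert_p_original(text):
    # Return expression before the proposed patch.
    return '%s\n\n' % text if text else ''


def convert_p_patched(text):
    # Return expression after the proposed patch.
    return '\n\n%s\n\n' % text if text else ''


# Simulate 'TEXT1<p>TEXT2</p>': preceding text followed by the paragraph.
before = 'TEXT1' + convert_p_original('TEXT2')
after = 'TEXT1' + convert_p_patched('TEXT2')

print(repr(before))  # 'TEXT1TEXT2\n\n'
print(repr(after))   # 'TEXT1\n\nTEXT2\n\n'
```

Under these assumptions, the patched expression separates TEXT1 from the paragraph, at the cost of leading newlines when the paragraph comes first in the input.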
