html-only emails allow publishing #690
Comments
I researched and dabbled with it a bit, and I even had an (infuriatingly bad) conversation with ChatGPT about it (ha! the new world!), and I have decided that there is no safe and easy way to strip HTML tags using regex or other simple means. ChatGPT gave me a few examples showing how stripping with regex could be dangerous. Anyway, I looked at bluemonday, and it seems that it doesn't pull in a giant chain of other dependencies, so I think it'll be fine to use it for HTML tag stripping. I'd be happy to accept PRs and/or may do it myself some day when I am bored. One important note about potential PRs: I do think we should prefer text/plain emails over text/html+stripping, which will change the parsing logic a little. |
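To make the "prefer text/plain over text/html+stripping" idea concrete, here is a minimal sketch using only the Go standard library. It is not ntfy's actual parsing code: the function name `pickBody` is hypothetical, and the HTML sanitizing step (e.g. with bluemonday) is deliberately left to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// pickBody (hypothetical helper) returns the first text/plain part of a
// message. If there is none, it falls back to the first text/html part.
// The returned media type tells the caller whether stripping is still needed.
func pickBody(raw string) (string, string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", "", err
	}
	ct := msg.Header.Get("Content-Type")
	if ct == "" {
		ct = "text/plain" // assumption: treat a missing header as plain text
	}
	mediaType, params, err := mime.ParseMediaType(ct)
	if err != nil {
		return "", "", err
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		// Single-part message: return the body as-is.
		b, err := io.ReadAll(msg.Body)
		return string(b), mediaType, err
	}
	html, haveHTML := "", false
	mr := multipart.NewReader(msg.Body, params["boundary"])
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", "", err
		}
		partType, _, err := mime.ParseMediaType(part.Header.Get("Content-Type"))
		if err != nil {
			continue // skip parts without a parsable content type
		}
		b, err := io.ReadAll(part)
		if err != nil {
			return "", "", err
		}
		switch partType {
		case "text/plain":
			return string(b), partType, nil // plain text always wins
		case "text/html":
			if !haveHTML {
				html, haveHTML = string(b), true
			}
		}
	}
	if haveHTML {
		return html, "text/html", nil
	}
	return "", "", errors.New("no text part found")
}

func main() {
	raw := "Subject: demo\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<b>hi</b>"
	body, mediaType, err := pickBody(raw)
	fmt.Println(body, mediaType, err)
}
```

A caller would only run the tag stripping when the returned media type is `text/html`.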
I'm fine with it. What about the first option - just ignore / delete body text? Implementation of html tag stripping could then be done later. |
I have experimented with bluemonday and a few html emails and the results are absolutely terrible. Even with post processing, the result looks something like this:
Not sure if this is better than having nothing at all. |
See for yourself: #693. Your demo email translates to this after my post-processing (ignore the " +" at the beginning of the lines):
|
One of the problems seems to be that (at least in my example) the charset is utf-8 but the transfer encoding is "Content-Transfer-Encoding: quoted-printable". As bluemonday does not seem to support this encoding, it may be necessary to convert the text to a "clean / full" utf-8 version before processing. E.g. FRITZ=21Box should be converted to FRITZ!Box. And I'm wondering about new characters for the processing of my text. In my example the original text Maybe this can help: How to get a quoted printable string in golang. But I see, converting and sanitizing HTML is difficult... As I said before, I could live with the complete deletion of the body text (if it's html-only). |
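For reference, the FRITZ=21Box to FRITZ!Box conversion mentioned above can be sketched with Go's standard `mime/quotedprintable` package. The helper name `decodeQuotedPrintable` is made up for illustration. Whether ntfy needs such an explicit step at all is a separate question; this only shows what the decoding does.

```go
package main

import (
	"fmt"
	"io"
	"mime/quotedprintable"
	"strings"
)

// decodeQuotedPrintable (hypothetical helper) decodes a quoted-printable
// string: "=XX" escapes become the byte 0xXX, and "=" at the end of a
// line (a soft line break) is removed.
func decodeQuotedPrintable(s string) (string, error) {
	b, err := io.ReadAll(quotedprintable.NewReader(strings.NewReader(s)))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := decodeQuotedPrintable("FRITZ=21Box")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "FRITZ!Box"
}
```

Multi-byte UTF-8 characters are encoded one byte per escape, so "=C3=BC" decodes to "ü" once the bytes are interpreted as UTF-8.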
Quoted-printable is transparently decoded by Go beforehand, so it should never be visible to the reader. See https://pkg.go.dev/mime/multipart#Reader.NextPart --
It is odd that the
That seems like a possibility. I may dabble with it a little more, and if I can't get anything good out, I'll do the title thing. |
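A small sketch of the behavior documented for `Reader.NextPart`: when a part declares "Content-Transfer-Encoding: quoted-printable", the header is hidden and the body is decoded during reads. The helper `readFirstPart` and the boundary "XX" are made up for this example, and the sample message is not taken from any of the emails in this thread.

```go
package main

import (
	"fmt"
	"io"
	"mime/multipart"
	"strings"
)

// readFirstPart (hypothetical helper) reads the first part of a multipart
// body and returns its content plus the Content-Transfer-Encoding header
// value as seen by the caller after NextPart has processed the part.
func readFirstPart(body, boundary string) (string, string, error) {
	mr := multipart.NewReader(strings.NewReader(body), boundary)
	part, err := mr.NextPart()
	if err != nil {
		return "", "", err
	}
	b, err := io.ReadAll(part)
	if err != nil {
		return "", "", err
	}
	return string(b), part.Header.Get("Content-Transfer-Encoding"), nil
}

func main() {
	body := "--XX\r\n" +
		"Content-Type: text/html; charset=utf-8\r\n" +
		"Content-Transfer-Encoding: quoted-printable\r\n" +
		"\r\n" +
		"<b>FRITZ=21Box</b>\r\n" +
		"--XX--\r\n"
	content, cte, err := readFirstPart(body, "XX")
	if err != nil {
		panic(err)
	}
	fmt.Printf("content=%q cte=%q\n", content, cte)
}
```

Note that this decoding only applies to parts read through `multipart.Reader`. A body read directly from `net/mail`'s `Message.Body` is returned without any transfer decoding.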
Are you sure that bluemonday detects the content-transfer-encoding at all? In my example, none of the quoted-printable codes are decoded correctly. Maybe a dedicated "header" (format / tag / declaration) is required, which is missing (at least) in my example. |
Hi! I just wanted to add to this with my use-case for HTML e-mails and Ntfy. I have been working to convert every conceivable device in my home to use Ntfy as my primary notification service. Unfortunately, several items still rely 100% on e-mail as their "only" form of notification, so the SMTP aspect of Ntfy has been a lifesaver (thanks for all the troubleshooting we did over the past few months in Discord & for quickly tackling #610!). The latest one I am trying to work on is my Synology NAS, which has e-mail notifications but uses HTML-formatted messages, so they are rejected with the "554 5.0.0 Error: transaction failed, blame it on the weather: unsupported content type" error. I know this was discussed/closed in #623, but I saw this issue/WIP PR #693 and wanted to add the code that I receive from Synology when I debugged Ntfy with an incoming e-mail. Thanks again and please let me know if there is anything else I can gather that might assist with this.
|
I am curious whether you want to use golang templates or some other IDL / AST to produce the email. I ask because:
|
Just checking if there's any movement on this or any workarounds. I was ecstatic when I found out about this project which replaced a bunch of telegram bots serving a similar purpose for me. I know nothing about Go so wouldn't know where to start. I did however use this project in the past to forward SMTP emails to the desired telegram channel. Unsure if it provides any pointers on the ContentType issues folks are facing. |
I have decided to merge the original PR and add support for HTML-only emails. It comes up enough to merge in support, even though it is very sub-par. Don't expect too much. But at least mail will not be rejected anymore. See #693 This will be in the next release. |
@Robert-litts I used your demo email in a test and it comes out nicely actually: 859a4e4 |
@binwiederhier Awesome & appreciate the effort on this one. I'm looking forward to testing this out. Thanks again. |
This is great news, and I can confirm it works for me, too 🥳 Thank you so much! |
Emails can be used to publish messages via ntfy. But html-only mails are rejected with the following error message "554 5.0.0 Error: transaction failed, blame it on the weather: unsupported content type (in reply to end of DATA command)"
That's obviously for security reasons, which are understandable (potentially active / malicious code hidden in the text).
IIRC, emails consisting of text and html text are processed by ignoring the html part.
But sometimes we cannot change the structure of an email, especially if we want to use emails from home servers, home automation etc. Fritz Box for example can be used for email alerts, but sends html-only mails.
I think there are two ways to solve this problem:
a) Fallback solution (as for emails consisting of both text and HTML):
Ignore or delete html part, meaning: delete / forget the body text completely. Then only the subject would be left for additional information. This would be completely acceptable to me. The text could be replaced by a warning that it has been removed (to avoid too many questions as to why this was done).
b) remove html tags with the help of regex.
This could be done by a (multiline-) search for "(?s)<.*?>" and replacing it with nothing.
That would be a sledgehammer approach, as the text is not preserved completely; in particular, references / links are removed, too. But this is quite acceptable for me. I'll attach the source code of an email from my fritz box before and after processing (in geany text editor)
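For illustration, option b) could be sketched in Go roughly like this. The function name `stripTags` is hypothetical; the pattern is the one proposed above, and Go's `regexp` package supports both the `(?s)` flag and the non-greedy `.*?`.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern is the pattern proposed in option b): (?s) lets "." also
// match newlines, and ".*?" is non-greedy, so each match ends at the
// first ">" after a "<".
var tagPattern = regexp.MustCompile(`(?s)<.*?>`)

// stripTags (hypothetical helper) removes everything that looks like a tag.
// It does not decode entities such as "&amp;" and it keeps the text inside
// <script> and <style> elements.
func stripTags(html string) string {
	return tagPattern.ReplaceAllString(html, "")
}

func main() {
	fmt.Println(stripTags("<p>Hello <b>world</b></p>"))
	fmt.Println(stripTags(`<a href="https://example.com">link</a>`))
	fmt.Println(stripTags("<script>alert(1)</script>"))
}
```

The second call shows the link target being lost, as described above, and the third shows that script content survives as plain text. This is a sketch of the sledgehammer only, not a sanitizer.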
1 email source text for test with ntfy.txt
2 email source text for test with ntfy after processing with regex.txt