[PROD] Translated snippets to French #1985

Closed
szabozoltan69 opened this issue Jan 5, 2024 · 4 comments

@szabozoltan69

szabozoltan69 commented Jan 5, 2024

Issue

If one creates an HTML snippet in English and the automatic translation fills in the translated fields (es, fr, ar), only the fr field ends up with strange attributes.
First, it also translates CSS keywords (for example, color becomes couleur). More dangerously, some quotation marks (") turn into the hex byte sequence "C2 A0 C2 BB" (' »', a no-break space followed by a right guillemet), which results in broken HTML. An example:

<span style="couleur : #3366ff ;"><a style="couleur : #3366ff ; » href="https://www.facebook.com/SYRedCrescent »
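
To make the byte sequence concrete, here is a minimal Python sketch that prints the UTF-8 encoding of the no-break space plus guillemet pair and locates it in a translated string. The helper name `find_suspect_quotes` and the shortened sample string are hypothetical and are not part of go-api.

```python
import re
from typing import List

# U+00A0 (no-break space) followed by U+00BB (right guillemet) is the
# French-style closing quote that replaced the ASCII double quote.
FRENCH_CLOSING_QUOTE = "\u00a0\u00bb"

# The UTF-8 encoding of that pair is the hex sequence reported in the issue.
print(FRENCH_CLOSING_QUOTE.encode("utf-8").hex(" "))  # c2 a0 c2 bb


def find_suspect_quotes(html: str) -> List[int]:
    """Return the offsets at which the guillemet pair occurs in the string."""
    pattern = re.escape(FRENCH_CLOSING_QUOTE)
    return [match.start() for match in re.finditer(pattern, html)]


# Shortened version of the broken example from the issue (assumed sample).
broken = (
    '<a style="couleur : #3366ff ;\u00a0\u00bb '
    'href="https://www.facebook.com/SYRedCrescent\u00a0\u00bb'
)
print(len(find_suspect_quotes(broken)))  # 2
```

A check like this only detects the symptom; it does not say where in the pipeline the substitution happens.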

I tried to catch the moment when the TinyMCE rich text is sent to the translation API, but I could not find it. Maybe the call does not happen between the browser client and the translation microservice, but between the GO backend and the translation microservice. It is also possible that TinyMCE sends the wrong text for translation. Any help is appreciated, @batpad @thenav56 @k9845.

@szabozoltan69

szabozoltan69 commented Jan 5, 2024

Maybe this is related to some rich-text localization feature or an MS issue: when quotation marks are nested, like " " " ", are the inner ones turned into "more readable" quotation marks (' »')?
Such character sequences appear often under:
/usr/local/lib/python3.8/site-packages/django/contrib/, e.g. in admin/locale/fr/LC_MESSAGES/django.po:
msgstr "Ajout de {name} « {object} »."

Some starting points (thanks @batpad):
https://learn.microsoft.com/en-us/azure/ai-services/translator/prevent-translation
https://learn.microsoft.com/en-us/azure/ai-services/translator/reference/v3-0-translate
https://github.com/IFRCGo/go-api/blob/develop/lang/tasks.py#L41
https://github.com/IFRCGo/go-api/blob/develop/lang/translation.py

@szabozoltan69

szabozoltan69 commented Jan 6, 2024

We have probably forgotten to include "textType": "html" as a parameter in the request body (not as a header).
So let's add textType to the payload at https://github.com/IFRCGo/go-api/blob/develop/lang/translation.py#L92.
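
As a rough illustration of that change, the sketch below builds a request body with the field names used by the curl requests in this thread ("text", "to", "textType"). The function `build_translation_payload` and its `is_html` flag are hypothetical; this is not the code in lang/translation.py, and the example URL is a placeholder.

```python
import json
from typing import Dict


def build_translation_payload(
    text: str, target_lang: str, is_html: bool = True
) -> Dict[str, str]:
    """Build the JSON body for a translation request (hypothetical helper)."""
    payload = {"text": text, "to": target_lang}
    if is_html:
        # Marks the text as markup so tags and attributes should be left alone.
        payload["textType"] = "html"
    return payload


snippet = '<p><a href="https://example.org/doc.pdf">Report</a></p>'
print(json.dumps(build_translation_payload(snippet, "fr")))
```

Keeping the flag explicit would let plain-text fields go through the same helper without being treated as markup.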

curl --location 'https://microservices-staging.ifrc.org/TranslationV2_API/api/Home/Translate?apiKey=...' \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: ...' \
--data '{"text": "<p><a href=\"https://prddsgofilestorage.blob.core.windows.net/api/documents/Tunisia_-_Climate_Fact_Sheet/TUNISIA_Climate_Fact_Sheet_EN.pdf\" data-jzz-gui-player=\"true\"><img src=\"https://prddsgofilestorage.blob.core.windows.net/api/tinymce/country173snippet.png\" alt=\"\" /></a></p>", "to": "fr"}'|jq

[
  {
    "detectedLanguage": {
      "language": "en",
      "score": 0
    },
    "translations": [
      {
        "text": "<p><a href=\"https://prddsgofilestorage.blob.core.windows.net/api/documents/Tunisia_-_Climate_Fact_Sheet/TUNISIA_Climate_Fact_Sheet_EN.pdf » data-jzz-gui-player=\"true\"><img src=\"https://prddsgofilestorage.blob.core.windows.net/api/tinymce/country173snippet.png » alt=\" » /></a></p>",
        "to": "fr"
      }
    ]
  }
]

curl --location 'https://microservices-staging.ifrc.org/TranslationV2_API/api/Home/Translate?apiKey=...' \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: ...' \
--data '{"text": "<p><a href=\"https://prddsgofilestorage.blob.core.windows.net/api/documents/Tunisia_-_Climate_Fact_Sheet/TUNISIA_Climate_Fact_Sheet_EN.pdf\" data-jzz-gui-player=\"true\"><img src=\"https://prddsgofilestorage.blob.core.windows.net/api/tinymce/country173snippet.png\" alt=\"\" /></a></p>", "to": "fr", "textType": "html"}'|jq

[
  {
    "detectedLanguage": {
      "language": "en",
      "score": 0
    },
    "translations": [
      {
        "text": "<p><a href=\"https://prddsgofilestorage.blob.core.windows.net/api/documents/Tunisia_-_Climate_Fact_Sheet/TUNISIA_Climate_Fact_Sheet_EN.pdf\" data-jzz-gui-player=\"true\"><img src=\"https://prddsgofilestorage.blob.core.windows.net/api/tinymce/country173snippet.png\" alt=\"\" /></a></p>",
        "to": "fr"
      }
    ]
  }
]

@szabozoltan69

szabozoltan69 commented Jan 7, 2024

If the submitted string contains ordinary text outside the tags, that text is translated while the markup is left intact:

curl --location 'https://microservices-staging.ifrc.org/TranslationV2_API/api/Home/Translate?apiKey=...' \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: ...' \
--data '{"text": "<p><a href=\"https://prddsgofilestorage.blob.core.windows.net/api/documents/Tunisia_-_Climate_Fact_Sheet/TUNISIA_Climate_Fact_Sheet_EN.pdf\" data-jzz-gui-player=\"true\"><img src=\"https://prddsgofilestorage.blob.core.windows.net/api/tinymce/country173snippet.png\" alt=\"\" /></a>Also there is some normal text to be really translated</p>", "to": "fr", "textType": "html"}'|jq
[
  {
    "detectedLanguage": {
      "language": "en",
      "score": 1
    },
    "translations": [
      {
        "text": "<p><a href=\"https://prddsgofilestorage.blob.core.windows.net/api/documents/Tunisia_-_Climate_Fact_Sheet/TUNISIA_Climate_Fact_Sheet_EN.pdf\" data-jzz-gui-player=\"true\"><img src=\"https://prddsgofilestorage.blob.core.windows.net/api/tinymce/country173snippet.png\" alt=\"\" /></a>De plus, il y a du texte normal à traduire</p>",
        "to": "fr"
      }
    ]
  }
]
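
The comparison done by eye above (same tags and attributes, only the visible text changed) could be automated along these lines. This is a sketch using Python's standard html.parser; the names `MarkupCollector`, `markup_signature` and `markup_preserved` are hypothetical, and the sample strings are shortened stand-ins for the real snippets.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

Attr = Tuple[str, Optional[str]]


class MarkupCollector(HTMLParser):
    """Collect every start tag with its attributes, ignoring text nodes."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[Tuple[str, List[Attr]]] = []

    def handle_starttag(self, tag: str, attrs: List[Attr]) -> None:
        self.tags.append((tag, attrs))


def markup_signature(html: str) -> List[Tuple[str, List[Attr]]]:
    """Return the ordered list of (tag, attributes) found in the string."""
    collector = MarkupCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def markup_preserved(source: str, translated: str) -> bool:
    """True if the translation kept tags and attribute values unchanged."""
    return markup_signature(source) == markup_signature(translated)


source = '<p><a href="https://example.org/a.pdf" target="_blank">Some text</a></p>'
good = '<p><a href="https://example.org/a.pdf" target="_blank">Du texte</a></p>'
# Closing quote of href replaced by no-break space + guillemet, as in the issue.
broken = '<p><a href="https://example.org/a.pdf\u00a0\u00bb target="_blank">Du texte</a></p>'

print(markup_preserved(source, good))
print(markup_preserved(source, broken))
```

Such a check could run after each translation call and flag fields whose markup changed, instead of relying on someone spotting broken snippets in production.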

@szabozoltan69

Navin's universal fix also solved this.
