Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate frequent translation strings #29823

Closed
wants to merge 22 commits into from

Conversation

denyskon
Copy link
Member

@denyskon denyskon commented Mar 15, 2024

As title. Cases where we get fancy with constructing the string are excluded for now, as are most strings which contain more detail, e. g. "Edit" is included but "Edit File" is not.

Per commit, only one string or few less frequent strings are changed to simplify review.

Part of #27700 , #24488

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 15, 2024
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 15, 2024
@github-actions github-actions bot added modifies/translation modifies/go Pull requests that update Go code modifies/templates This PR modifies the template files labels Mar 15, 2024
@denyskon denyskon added the type/refactoring Existing code has been cleaned up. There should be no new functionality. label Mar 15, 2024
@pull-request-size pull-request-size bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 15, 2024
@denyskon denyskon marked this pull request as ready for review March 15, 2024 20:05
@@ -9,7 +9,7 @@
</a>
{{end}}
<a class="{{if .PageIsExploreOrganizations}}active {{end}}item" href="{{AppSubUrl}}/explore/organizations">
{{svg "octicon-organization"}} {{ctx.Locale.Tr "explore.organizations"}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a personal viewpoint (#29816): Oh no, a merge conflict 🥲

@@ -1281,7 +1246,6 @@ editor.file_delete_success = File "%s" has been deleted.
editor.name_your_file = Name your file…
editor.filename_help = Add a directory by typing its name followed by a slash ('/'). Remove a directory by typing backspace at the beginning of the input field.
editor.or = or
editor.cancel_lower = Cancel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did a lower ever end up in the translations?
And why isn't it even lowercased?

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Mar 15, 2024
@lafriks
Copy link
Member

lafriks commented Mar 15, 2024

But that will make it untranslatable correctly and that defeats the whole point of translating. It would be the same as proposing to replace all occurrences in English language for repositories to just single repository it would just look plainly wrong

Copy link
Member

@lafriks lafriks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking for now as this really needs to be reviewed more from the point of other languages

@GiteaBot GiteaBot added lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Mar 15, 2024
@denyskon
Copy link
Member Author

But if we really want to make it correct from every grammatical point, we would need a new translation key for every scope where it is used. We have private in the context of users, orgs, repos, projects, email addresses, members.... we simply cannot maintain an own translation for every single one of these things. What would be the alternative? We cannot give a word a gender (that's the issue if I understand correctly) because it changes depending on the language.

Does Latvian for example really not contain a way to express this state neutrally? In Ukrainian or Russian there's the same problem, but you can avoid it by using an adverb instead of the adjective. Yes it sounds weird, but you can get used to it.

Alternatively you can translate with braces like Publisk(a) in your case.

@silverwind
Copy link
Member

silverwind commented Mar 15, 2024

I totally agree to dedupe down to one value. It will definitely be a good enough translation still. If by chance one of the reviewer notices a case where context helps, we can keep these and mark the dupe ones with a comment. But we can't possible know all these cases in advance and there is already some "sharing" going on as well.

@silverwind
Copy link
Member

silverwind commented Mar 15, 2024

I will add some basic "linting" capabilities to the script in #29825 which will be a good validation to run this PR against to detect unused or missing translations.

@lafriks
Copy link
Member

lafriks commented Mar 15, 2024

But if we really want to make it correct from every grammatical point, we would need a new translation key for every scope where it is used. We have private in the context of users, orgs, repos, projects, email addresses, members.... we simply cannot maintain an own translation for every single one of these things. What would be the alternative? We cannot give a word a gender (that's the issue if I understand correctly) because it changes depending on the language.

Does Latvian for example really not contain a way to express this state neutrally? In Ukrainian or Russian there's the same problem, but you can avoid it by using an adverb instead of the adjective. Yes it sounds weird, but you can get used to it.

Alternatively you can translate with braces like Publisk(a) in your case.

Latvian language does not have neutral way of expressing such cases and while it would be possible to write Publisks(-a) such form are pretty much never used because for some words it's not possible to do as not only last letter changes.

@silverwind
Copy link
Member

silverwind commented Mar 15, 2024

Latvian language does not have neutral way of expressing such cases and while it would be possible to write Publisks(-a) such form are pretty much never used because for some words it's not possible to do as not only last letter changes.

On a scale of 1 (nit) to 10 (meaning gets lost), how bad is the error? Certain minor issues have to be accepted imho.

Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different languages have different words, some words can't be simply "deduplicated"

The last time it was wrong for "color": Translation wrong for word color in PR #16647 #17384 & Re-separate the color translation strings #17390

@denyskon
Copy link
Member Author

Then please propose a solution

@lafriks
Copy link
Member

lafriks commented Mar 16, 2024

Imho adjectives most probably cannot be deduplicated to keep context, also cannot deduplicate where one is verb (action), other noun (label) even if both in English are written the same

@denyskon
Copy link
Member Author

I agree with the second one, but we cannot create a new adjective translation for every case where it is used. If they exist currently, it's mostly just by accident

@denyskon
Copy link
Member Author

denyskon commented Mar 16, 2024

Compromise for now - I revert the commits for everything that is used as adjective/verb, we can at least merge the non-confusing nouns, and then we think of a solution for the other ones

@@ -23,7 +23,7 @@ func (o OwnerType) LocaleString(locale translation.Locale) string {
case OwnerTypeRepository:
return locale.TrString("concept_code_repository")
case OwnerTypeOrganization:
return locale.TrString("concept_user_organization")
return locale.TrString("organization")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why I used concept_user_organization here, the reason is the same as the previous discussion: I believe we should clearly define the concepts with context. I think "organization" could have different means in different contexts in different languages.

  • One context, one concept, one translation.
  • Different context, different concept, different translation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I increasingly think we need to restructure translations completely. Not group them by usage (new issue page, admin user table, installation page) but by context or meaning ([permissions], [operations], [repo_units] and so on)

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Mar 16, 2024

I still think there are unclear concepts.

For example: "dashboard". I can see in Chinese translation, it was translated to "Home page". It is right in Chinese when the "dashboard" is used in the home page context. But it is wrong when it is used in the admin pages. But this PR "deduplicates" them .... it leads to wrong result.

To answer the question "Then please propose a solution":

My opinion is: "only take actions when it matches at least one of these conditions":

  1. If there is a real problem and it really affects end users (say, a bug).
  2. There are some native speakers and language experts clearly know how to do it.
  3. We clearly know all the details and know how to do it.

My understanding about how to make the translations correct:

  • One context, one concept, one translation.
  • Different context, different concept, different translation.
  • Do not over-abstract.

For this PR's change, I think it doesn't aim to resolve bugs, and we haven't got a full picture of all language translations.

@denyskon
Copy link
Member Author

@go-gitea/technical-oversight-committee

I think we need an opinion from the TOC here

@silverwind
Copy link
Member

Imho some people just need to go through the modifications in the ini and highlight the cases where deduplication is not possible and then we split those up again. Just a normal review process imho. Just saying "no" to the entire PR is not constructive feedback.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Mar 16, 2024

Imho some people just need to go through the modifications in the ini and highlight the cases where deduplication is not possible and then we split those up again. Just a normal review process imho. Just saying "no" to the entire PR is not constructive feedback.

I'd like to, but I only understand English and Chinese, and I have shown the real problem (eg: "dashboard"). So I can't comment for other languages. Actually I would prefer to insist my suggestion in #29823 (comment) .

@jolheiser
Copy link
Member

We may need wider feedback on this, or to take inspiration from other projects.
I agree with the notion of reducing translation complexity, but also know that it's never as simple as deduping.

Do any of our translators have a suggestion on the matter, how it could be made easier?
If it's to reorganize them by context, like mentioned above?

Would weblate make this easier or harder?

Glad this PR sparked the conversation, but we'll want to make sure we have an agreed solution before we merge.

@denyskon denyskon closed this Aug 15, 2024
@denyskon denyskon deleted the locale-cleanup-2 branch August 18, 2024 13:43
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Nov 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged modifies/go Pull requests that update Go code modifies/templates This PR modifies the template files modifies/translation size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. type/refactoring Existing code has been cleaned up. There should be no new functionality.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants