i18n brainstorming #553

Open
Rich-Harris opened this issue Feb 23, 2019 · 247 comments
Labels
  • feature / enhancement: New feature or request
  • p1-important: SvelteKit cannot be used by a large number of people, basic functionality is missing, etc.
  • size:large: significant feature with tricky design questions and multi-day implementation
@Rich-Harris
Member

Rich-Harris commented Feb 23, 2019

We've somewhat glossed over the problem of internationalisation up till now. Frankly this is something SvelteKit isn't currently very good at. I'm starting to think about how to internationalise/localise https://svelte.dev, to see which parts can be solved in userland and which can't.

(For anyone unfamiliar: 'Internationalisation' or i18n refers to the process of making an app language agnostic; 'localisation' or l10n refers to the process of creating individual translations.)

This isn't an area I have a lot of experience in, so if anyone wants to chime in — particularly non-native English speakers and people who have dealt with these problems! — please do.

Where we're currently at: the best we can really do is put everything inside src/routes/[lang] and use the lang param in preload to load localisations (an exercise left to the reader, albeit a fairly straightforward one). This works, but leaves a few problems unsolved.

I think we can do a lot better. I'm prepared to suggest that SvelteKit should be a little opinionated here rather than abdicating responsibility to things like i18next, since we can make guarantees that a general-purpose framework can't, and can potentially do interesting compile-time things that are out of reach for other projects. But I'm under no illusions about how complex i18n can be (I recently discovered that a file modified two days ago will be labeled 'avant-hier' on MacOS if your language is set to French; most languages don't even have a comparable phrase. How on earth do you do that sort of thing programmatically?!) which is why I'm anxious for community input.


Language detection/URL structure

Some websites make the current language explicit in the pathname, e.g. https://example.com/es/foo or https://example.com/zh/foo. Sometimes the default is explicit (https://example.com/en/foo), sometimes it's implicit (https://example.com/foo). Others (e.g. Wikipedia) use a subdomain, like https://cy.example.com. Still others (Amazon) don't make the language visible, but store it in a cookie.

Having the language expressed in the URL seems like the best way to make the user's preference unambiguous. I prefer /en/foo to /foo since it's explicit, easier to implement, and doesn't make other languages second-class citizens. If you're using subdomains then you're probably running separate instances of an app, which means it's not SvelteKit's problem.

There still needs to be a way to detect language if someone lands on /. I believe the most reliable way to detect a user's language preference on the server is the Accept-Language header (please correct me if nec). Maybe this could automatically redirect to a supported localisation (see next section).
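
For illustration, the header-based detection described above could be sketched roughly like this. The function names (parseAcceptLanguage, pickLocale) and the list of supported locales are hypothetical, not an existing Sapper or SvelteKit API, and the parsing is deliberately simplified compared with the full HTTP grammar.

```javascript
// Parse an Accept-Language header into language tags sorted by preference.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const quality = q ? Number(q.slice(2)) : 1;
      return {
        tag: tag.trim().toLowerCase(),
        quality: Number.isNaN(quality) ? 0 : quality
      };
    })
    .filter((entry) => entry.tag && entry.tag !== '*' && entry.quality > 0)
    .sort((a, b) => b.quality - a.quality)
    .map((entry) => entry.tag);
}

// Pick the first supported locale, trying the full tag and then its base language.
function pickLocale(header, supported, fallback) {
  for (const tag of parseAcceptLanguage(header || '')) {
    if (supported.includes(tag)) return tag;
    const base = tag.split('-')[0];
    if (supported.includes(base)) return base;
  }
  return fallback;
}
```

A handler for / could then redirect to `/${pickLocale(header, supported, 'en')}`, where the default locale is whatever the app chooses.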

Supported localisations

It's useful for SvelteKit to know at build time which localisations are supported. This could perhaps be achieved by having a locales folder (configurable, obviously) in the project root:

locales
|- de.json
|- en.json
|- fr.json
|- ru.json
src
|- routes
|- ...

Single-language apps could simply omit this folder, and behave as they currently do.

lang attribute

The <html> element should ideally have a lang attribute. If SvelteKit has i18n built in, we could achieve this the same way we inject other variables into src/template.html:

<html lang="%svelte.lang%">

Localised URLs

If we have localisations available at build time, we can localise URLs themselves. For example, you could have /en/meet-the-team and /de/triff-das-team without having to use a [parameter] in the route filename. One way we could do this is by encasing localisation keys in curlies:

src
|- routes
   |- index.svelte
   |- {meet_the_team}.svelte

In theory, we could generate a different route manifest for each supported language, so that English-speaking users would get a manifest with this...

{
  // index.svelte
  pattern: /^\/en\/?$/,
  parts: [...]
},

{
  // {meet_the_team}.svelte
  pattern: /^\/en\/meet-the-team\/?$/,
  parts: [...]
}

...while German-speaking users download this instead:

{
  // index.svelte
  pattern: /^\/de\/?$/,
  parts: [...]
},

{
  // {meet_the_team}.svelte
  pattern: /^\/de\/triff-das-team\/?$/,
  parts: [...]
}
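
As a rough sketch of how such per-language manifests might be generated at build time: the slug tables and the routePattern and buildManifest helpers below are hypothetical, and only top-level route files are handled.

```javascript
// Hypothetical per-locale slug tables, e.g. loaded from locales/*.json.
const slugs = {
  en: { meet_the_team: 'meet-the-team' },
  de: { meet_the_team: 'triff-das-team' }
};

// Escape characters that have special meaning in a regular expression.
const escapeRegex = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Turn a route file name such as '{meet_the_team}.svelte' into a pattern
// for one locale. 'index.svelte' maps to the locale root.
function routePattern(locale, file) {
  const name = file.replace(/\.svelte$/, '');
  if (name === 'index') return new RegExp(`^\\/${locale}\\/?$`);
  const key = /^\{(.+)\}$/.exec(name);
  const segment = key ? slugs[locale][key[1]] : name;
  if (segment === undefined) {
    throw new Error(`No "${key[1]}" slug for locale "${locale}"`);
  }
  return new RegExp(`^\\/${locale}\\/${escapeRegex(segment)}\\/?$`);
}

// One manifest per supported language.
function buildManifest(locale, files) {
  return files.map((file) => ({ file, pattern: routePattern(locale, file) }));
}
```

Calling buildManifest('de', ['index.svelte', '{meet_the_team}.svelte']) yields patterns that match /de and /de/triff-das-team but not /de/meet-the-team.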

Localisation in components

I think the best way to make the translations themselves available inside components is to use a store:

<script>
  import { t } from '$app/stores';
</script>

<h1>{$t.hello_world}</h1>

Then, if you've got files like these...

// locales/en.json
{ "hello_world": "Hello world" }
// locales/fr.json
{ "hello_world": "Bonjour le monde" }

...SvelteKit can load them as necessary and coordinate everything. There's probably a commonly-used format for things like this as well — something like "Willkommen zurück, $1":

<p>{$t.welcome_back(name)}</p>

(In development, we could potentially do all sorts of fun stuff like making $t be a proxy that warns us if a particular translation is missing, or tracks which translations are unused.)
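
A minimal sketch of both ideas, assuming "$1"-style placeholders and using a plain object rather than a Svelte store. compileMessages and withDevChecks are made-up names for illustration only.

```javascript
// Compile "Willkommen zurück, $1" style strings into functions; leave
// placeholder-free strings as they are.
function compileMessages(messages) {
  const compiled = {};
  for (const [key, value] of Object.entries(messages)) {
    compiled[key] = /\$\d+/.test(value)
      ? (...args) => value.replace(/\$(\d+)/g, (_, i) => String(args[Number(i) - 1]))
      : value;
  }
  return compiled;
}

// Development-only wrapper: warn on missing keys and record which keys were read.
function withDevChecks(messages, warn = console.warn) {
  const used = new Set();
  const proxy = new Proxy(messages, {
    get(target, key) {
      if (typeof key !== 'string') return target[key];
      if (!(key in target)) {
        warn(`Missing translation: ${key}`);
        return key; // fall back to the key so the page still renders
      }
      used.add(key);
      return target[key];
    }
  });
  const unused = () => Object.keys(messages).filter((key) => !used.has(key));
  return { t: proxy, unused };
}
```

In production the proxy would presumably be dropped, so the checks cost nothing at runtime.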

Route-scoped localisations

We probably wouldn't want to put all the localisations in locales/xx.json — just the stuff that's needed globally. Perhaps we could have something like this:

locales
|- de.json
|- en.json
|- fr.json
|- ru.json
src
|- routes
   |- settings
      |- _locales
         |- de.json
         |- en.json
         |- fr.json
         |- ru.json
      |- index.svelte

Again, we're in the fortunate position that SvelteKit can easily coordinate all the loading for us, including any necessary build-time preparation. Here, any keys in src/routes/settings/_locales/en.json would take precedence over the global keys in locales/en.json.

Translating content

It's probably best if SvelteKit doesn't have too many opinions about how content (like blog posts) should be translated, since this is an area where you're far more likely to need to e.g. talk to a database, or otherwise do something that doesn't fit neatly into the structure we've outlined. Here again, there's an advantage to having the current language preference expressed in the URL, since userland middleware can easily extract that from req.path and use that to fetch appropriate content. (I guess we could also set a req.lang property or something if we wanted?)

Base URLs

Sapper (ab)used the <base> element to make it easy to mount apps on a path other than /. <base> could also include the language prefix so that we don't need to worry about it when creating links:

<!-- with <base href="de">, this would link to `/de/triff-das-team` -->
<a href={$t.meet_the_team}>{$t.text.meet_the_team}</a>

Base URLs haven't been entirely pain-free though, so this might warrant further thought.


Having gone through this thought process I'm more convinced than ever that SvelteKit should have i18n built in. We can make it so much easier to do i18n than is currently possible with libraries, with zero boilerplate. But this could just be arrogance and naivety from someone who hasn't really done this stuff before, so please do help fill in the missing pieces.

@AlexxNB

AlexxNB commented Feb 23, 2019

I've some thoughts about it...

Plurals

There should be a way to use the right plural form of a phrase based on a number value.

Something like...
<p>{$t.you_have} {number} {$t.message(number)}</p>

// locales/en.json
{
  "you_have": "You have",
  "message": ["message", "messages"]
}

// locales/ru.json
{
  "you_have": "У вас",
  "message": ["сообщение", "сообщения", "сообщений"]
}

And we should know plural rules for all languages of the world =)
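
For what it's worth, JavaScript runtimes already ship plural rules through the standard Intl.PluralRules API, so a helper along these lines may be enough for the array-based format above. This is only a sketch: createPlural is a hypothetical name, and it assumes the forms are listed in CLDR category order (one, few, many for Russian).

```javascript
// CLDR category order; a locale uses a subset of these.
const PLURAL_CATEGORIES = ['zero', 'one', 'two', 'few', 'many', 'other'];

// Build a function that picks the right form from an array ordered by the
// locale's plural categories, e.g. [one, other] for English.
function createPlural(locale, forms) {
  const rules = new Intl.PluralRules(locale);
  const available = new Set(rules.resolvedOptions().pluralCategories);
  const categories = PLURAL_CATEGORIES.filter((c) => available.has(c));
  return (n) => {
    const index = categories.indexOf(rules.select(n));
    // If fewer forms than categories were supplied, reuse the last form.
    return forms[Math.min(index, forms.length - 1)];
  };
}

const message = {
  en: createPlural('en', ['message', 'messages']),
  ru: createPlural('ru', ['сообщение', 'сообщения', 'сообщений'])
};
```

With this, message.ru(21) picks the singular form and message.ru(5) the genitive plural, without hand-written rules per language.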

Formatting

For example, Americans write dates as 02/23, while Russians write 23.02. There are a lot of other things, like currency, that may need locale-specific formatting.

Language detection/URL structure

I'd prefer to use a third-level domain like ru.svelte.technology, but with a single app with i18n support. I understand that it requires server configuration that a typical Sapper user may not know how to do. Maybe we could have a config option to choose between URL-path-based and domain-based language detection.

Localised URLs

Don't do it. I have never seen an i18n system that supports it. There are additional problems like checking URLs for unsupported symbols, complexity of the routes and code, broken links and so on. This is my personal opinion.

Localisation in components

Maybe we can find a better way and replace this...
<p>{$t.welcome_back(name)}</p>
with
<p>{$t(`Welcome back ${name}`)}</p>

Developer should be able to use default language visually. It will make content creation simpler for the author.
But I have no idea about perfect implementation...

Alternative

My position is that Sapper doesn't need a built-in i18n system at all. Only a small fraction of all websites need to be multilingual. We can make a separate Sapper plugin that provides i18n, or we can use one of the many existing libraries.

PS: Sorry for my poor english...

@chris-morgan

Some thoughts:

  1. t should not be a store (though it’s reasonable for the current locale to be a store): switching languages is an exceedingly rare operation (in almost all apps I’d expect it to be used in well under one in ten thousand sessions—the default is probably right, and very few ever change locale, and those that do so probably only need to once, because you then store their preferred language, right? right?), and it being a store adds a fair bit of bloat. Rather, switching locales should just throw away everything (sorry about your transient state, but switching languages is effectively moving to a different page) and redraw the root component.

  2. You need the concept of “such-and-such a URL, but in a different locale”, for things like language switchers and the equivalent meta tags.

  3. Localised slugs are conceptually and generally aesthetically nice, but risky. If the locale is in the URL, you would still ideally handle localised forms of slugs, for naive users and possibly tools that try just swapping the locale in the URL (viz. /en/triff-das-team redirects to /en/meet-the-team). You will observe that points two and five also contain serious risks on localised slugs, where they mean that you must know the localised form of the slugs in all locales, regardless of what locale you’re in at present.

  4. Locale-in-URL is not the only thing you may want: it’s common to encode the country in there too (or instead), and have that matter for more than just language strings. For example, /en-au/ might switch to show Australian products and prices, as well as hopefully talking about what colour the widgets you can buy are, but let’s be honest, they probably didn’t actually make an en-au locale, so it’ll probably just be using the en locale which will be inexplicably American-colored despite not being called en-us. But I digress.

  5. What you’ve described sounds to be designed for web sites rather than web apps.

    • Sites: including the locale in the URL is generally the right thing to do, although it can be a pain when people share links and they’re in the wrong language. If the user changes language, you probably still want to save that fact in a cookie or user database, so that the site root chooses the right language. (Otherwise users of non-default languages will be perpetually having to switch when they enter via search engines.) You should ideally still support locale-independent URLs, so that if I drop the en/ component of the URL it’ll pick the appropriate locale for the user. I believe setting that as the canonical URL for all the locale variants will help search engines too, and thus users so that they’re not constantly taken back to the English form if they wanted Norwegian, but I have no practical experience with the same (my actual i18n experience is only in apps).

    • Apps: including the locale in the URL is generally the wrong thing to do; you will instead want to keep track of each user’s language per account; localised slugs are right out for apps in general, too.
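
To make points 2 and 3 concrete, here is a rough sketch of "this URL, but in another locale". It assumes every locale's slug table is available (which is exactly the cost point 3 warns about) and only handles single-segment paths. routeSlugs, switchLocale and alternateLinks are hypothetical names.

```javascript
// Hypothetical slug tables: one entry per route key, per locale.
const routeSlugs = {
  meet_the_team: { en: 'meet-the-team', de: 'triff-das-team' }
};

// Find which route key a localised slug belongs to, whatever its locale.
function findRouteKey(slug) {
  return Object.keys(routeSlugs).find((key) =>
    Object.values(routeSlugs[key]).includes(slug)
  );
}

// Translate a path like '/de/triff-das-team' into another locale.
function switchLocale(path, targetLocale) {
  const [, , slug = ''] = path.split('/');
  if (slug === '') return `/${targetLocale}`;
  const key = findRouteKey(slug);
  const translated = key && routeSlugs[key][targetLocale];
  return `/${targetLocale}/${translated || slug}`;
}

// <link rel="alternate"> tags for every supported locale.
function alternateLinks(origin, path, locales) {
  return locales.map(
    (locale) =>
      `<link rel="alternate" hreflang="${locale}" href="${origin}${switchLocale(path, locale)}">`
  );
}
```

The same lookup covers the redirect case: switchLocale('/en/triff-das-team', 'en') resolves to /en/meet-the-team.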

@AlexxNB

AlexxNB commented Feb 23, 2019

Route-scoped localisations

Maybe a better way is component-scoped localizations? Like in the Vue-i18n plugin.
But any scoped localizations make things harder for translators to support.

@Rich-Harris
Member Author

Thanks for this input, it's very valuable.

Just to clarify my thinking a bit: if we have locales in .json files, Sapper can turn those at build time into JavaScript modules. This is what I mean when I say that we're potentially able to do things in Sapper — 'precompiled' translations with treeshaken i18n helpers — that might be a little difficult with more general-purpose tools. So:

Plurals

This locale file...

// locales/ru.json
{
  "you_have": "У вас",
  "message": ["сообщение", "сообщения", "сообщений"]
}

could be compiled into something like this:

import { create_plural } from '@sveltejs/i18n/ru';

export default {
  you_have: 'У вас',
  message: create_plural(['сообщение', 'сообщения', 'сообщений'])
};

create_plural encodes all the (somewhat complex, I just learned! 😆 ) pluralisation rules for the Russian language.

(Having said that, the examples on https://www.i18njs.com point towards using more complete sentences as keys, i.e. "You have %n messages" rather than "you_have" and "message".)

In an ideal world, someone would have already encoded all the different pluralisation rules already in a way that we can just reuse. I don't know if that's the case.

Formatting

I wonder if that can be done with symbols like %c and %n and %d?

// locales/fr.json
{
  "Your payment of %c for %n widgets is due on %d": 
    "Votre paiement de %c pour %n widgets est dû le %d."
}

import { format_currency, format_date } from '@sveltejs/i18n/fr';

export default {
  'Your payment of %c for %n widgets is due on %d': (c, n, d) =>
    `Votre paiement de ${format_currency(c)} pour ${n} widgets est dû le ${format_date(d)}.`
};

(Am glossing over the pluralisation of 'widgets' and the nuances of date formatting — '1 April' vs '1 April 2019' vs 'tomorrow' or 'is overdue' — but you get the general thrust.)
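
One way the formatters could be backed by the platform instead of hand-written per-locale code is the standard Intl API. The sketch below is an assumption-laden illustration: the %c/%n/%d/%s placeholder set, createFormatters and compileTemplate are invented here, and the currency code has to be supplied by the caller.

```javascript
// Map each placeholder to a formatter built on the standard Intl API.
function createFormatters(locale, currency) {
  const money = new Intl.NumberFormat(locale, { style: 'currency', currency });
  const number = new Intl.NumberFormat(locale);
  const date = new Intl.DateTimeFormat(locale, { dateStyle: 'long', timeZone: 'UTC' });
  return {
    c: (value) => money.format(value),
    n: (value) => number.format(value),
    d: (value) => date.format(value),
    s: (value) => String(value)
  };
}

// Compile "... %c ... %n ... %d" into a function taking arguments in placeholder order.
function compileTemplate(template, formatters) {
  return (...args) => {
    let position = 0;
    return template.replace(/%([cnds])/g, (_, type) => formatters[type](args[position++]));
  };
}
```

Because arguments are consumed in the order the placeholders appear in the translated string, a translation that reorders them would need numbered placeholders instead, which is a limitation of this positional scheme.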

Localised URLs

Don't do it. I have never seen an i18n system that supports it.

I had a conversation on Twitter recently providing one data point to the contrary. I think you're right that it's very rare, though I have to wonder if that's because of the sheer difficulty of it with existing tools (i.e. the same reason it can't be done in userland in Sapper). Django supports it — see the example of /en/news/category/recent/ vs /nl/nieuws/categorie/recent/.

At the very least, if we did it, it would be opt-in: you would just need to choose between meet-the-team.svelte and {meet_the_team}.svelte.

Localisation in components

Developer should be able to use default language visually.

Yeah, I think this is probably true, though I wonder if it makes it harder to keep localisations current. Anyway, I did have one idea about usage — maybe there's a clever way to use tagged template literals:

<p>{$t`Welcome back ${name}`}</p>
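
A sketch of how such a tag function might work, assuming translation keys are the default-language text with positional placeholders such as "${0}". That key format and the createTag name are assumptions made for this example, and the result is a plain function rather than a store.

```javascript
// Build a tag function for one language. Keys are the default-language text
// with positional placeholders: "Welcome back ${0}".
function createTag(translations) {
  return (strings, ...values) => {
    // Rebuild the lookup key from the literal parts of the template.
    const key = strings.reduce(
      (acc, part, i) => acc + part + (i < values.length ? '${' + i + '}' : ''),
      ''
    );
    const known = Object.prototype.hasOwnProperty.call(translations, key);
    // Untranslated strings fall back to the default-language text.
    const template = known ? translations[key] : key;
    return template.replace(/\$\{(\d+)\}/g, (_, i) => String(values[Number(i)]));
  };
}

const ru = createTag({ 'Welcome back ${0}': 'С возвращением, ${0}' });
```

Positional placeholders let a translation reorder the values, which the literal order in the source template cannot express.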

Maybe a better way is component-scoped localizations?

We shouldn't rule it out. Separate .json files would definitely make the implementation easier though...

PS: Sorry for my poor english...

Ваш английский лучше моего русского 😀

@chris-morgan

it being a store adds a fair bit of bloat

Can you expand on that? Sapper is already using stores so there's no additional weight there — do you mean the component subscriptions? I guess it could be t instead of $t, it'd mean we'd need a way to force reload on language change rather than using normal client-side navigation.

You need the concept of “such-and-such a URL, but in a different locale”, for things like language switchers and the equivalent meta tags.

Interesting... can you elaborate? I take it a <link rel="canonical"> is insufficient? Definitely preferable to avoid a situation where every locale needs to know every slug for every other locale.

For example, /en-au/ might switch to show Australian products and prices

I guess this is where precompilation could come in handy — we could generate en-us, en-gb and en-au from en.json by just swapping out the currency formatters or whatever (though you'd need a way to say 'Sapper, please support these countries'). Maybe the existence of an en-au.json locale file would be enough for that; any missing keys would be provided by en.json:

// locales/en.json
{
  "Hello": "Hello",
  "Welcome back, %s": "Welcome back, %s"
}

// locales/en-au.json: the existence of this file causes a module to be created
// that uses Australian currency formatter etc
{
  "Hello": "G'day mate"
  // other keys fall back to en.json
}
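
The fallback itself could be a simple build-time merge. In this sketch the chain is derived by trimming subtags from the right (en-au, then en). fallbackChain and resolveMessages are hypothetical helpers, and files stands in for the parsed JSON files.

```javascript
// 'en-au' -> ['en-au', 'en']; 'zh-hant-tw' -> ['zh-hant-tw', 'zh-hant', 'zh']
function fallbackChain(tag) {
  const parts = tag.toLowerCase().split('-');
  return parts.map((_, i) => parts.slice(0, parts.length - i).join('-'));
}

// Merge the chain so that more specific files override more general ones.
function resolveMessages(tag, files) {
  return fallbackChain(tag)
    .reverse()
    .reduce((merged, name) => Object.assign(merged, files[name]), {});
}
```

A locale with no file of its own (say en-gb) would then simply resolve to the contents of en.json.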

What you’ve described sounds to be designed for web sites rather than web apps.

Yeah, I can see that. I guess it could be as simple as an option — sapper build --i18n-prefix for /xx/foo, or sapper build --no-i18n-prefix for /foo, or something.


A few things that didn't occur to me last night:

Constant URLs

Not every URL should be localised — static assets, for example, but also probably some pages and server routes. Maybe need a way to distinguish between them.

RTL languages

No idea what's involved here.

SEO

Some good information here.


Looking at all this, I can certainly see why someone would say 'this is too much complexity, it should be handled by third party tools'. On the contrary I think that's probably why it should be handled by Sapper — it's a hard problem that is very difficult to deal with entirely in userland, and which is prone to bloaty solutions.

@rtorr

rtorr commented Feb 23, 2019

One thing that has been helpful in some of my implementations is partial localization. If a key does not exist in one language, it will fallback to (in my case) English.

@AlexxNB

AlexxNB commented Feb 23, 2019

In an ideal world, someone would have already encoded all the different pluralisation rules already in a way that we can just reuse. I don't know if that's the case.

The rules are described by Mozilla: https://developer.mozilla.org/en-US/docs/Mozilla/Localization/Localization_and_Plurals . However, I don't think they cover all possible languages, so the ability to supply a custom plural function would be very useful for native speakers from remote Pacific islands.

Reading the whole of Mozilla's Localization section may give some other thoughts about i18n, l10n and even l12y.

maybe there's a clever way to use tagged template literals

Wow, it looks amazing!

I wonder if it makes it harder to keep localisations current

Maybe we can store strings as JSON property names, and autogenerate the default-language JSON file when compiling? Then translators can look for differences in the other JSON files.

// locales/en.json
{
	"Military-grade progressive web apps, powered by Svelte": "Military-grade progressive web apps, powered by Svelte",
	"You are in the army now, ${name}": "You are in the army now, ${name}"
}

// locales/ru.json
{
	"Military-grade progressive web apps, powered by Svelte": "Прогрессивные веб-приложения военного качества на платформе Svelte",
	"You are in the army now, ${name}": "Теперь ты в армии, ${name}"
}

@trbrc

trbrc commented Feb 23, 2019

(For anyone unfamiliar: 'Internationalisation' or i18n refers to the process of making an app language agnostic; 'localisation' or l10n refers to the process of creating individual translations.)

This might come across as a little pedantic, but localisation and translation aren't quite the same thing, even though the terms are often used interchangeably. Localisation is about adapting to a specific region (locale), while translation is about adapting to a specific language. One locale might need multiple languages, and one language can exist in multiple locales.

But I don't think it's correct to have a folder called locales with files named after language codes. A better folder name would be languages or translations.

In practice, localisation might involve any kind of change to a website, or no change at all. So I think it has to be handled with feature flags or other configuration.

@thgh

thgh commented Feb 23, 2019

Thoughts:

A fallback language stack. If a key is missing, it could look in the next language file and finally fall back to the language key itself. For example: es || fr || en || key. Another advantage is that developers don't have to come up with custom translation keys, and can instead write {$t('Meet the team')}. In my experience, developers are terrible at choosing translation keys, like button_1, button_2, button_3, ...

How about a directive? {#t 'Welcome back' name}

It would be pretty awesome if there was a hook that allowed sending new strings to an automatic translation service. If that's too much, perhaps a list of translation keys in the manifest?

By the way, I'm using this translation store in a project:

import { derive, readable } from 'svelte/store'

export const lang = readable(set => set(process.browser && localStorage.lang || 'nl'))

export const t = derive(lang, lang => translations[lang])

export const translations = {
  nl: {
    meet: 'ons team'
  },
  en: {
    meet: 'meet the team'
  }
}

Definitely not ideal. I would prefer if there were a $session or $cookie store that I could derive from.

@arxpoetica
Member

arxpoetica commented Feb 23, 2019

Having worked extensively on projects that involve multiple translations (I was involved for many years on building a platform that ingested language files from institutions like Oxford Publishing with internationalized projects and so forth), I can say with certainty this is a massive rabbit hole.

That said, I applaud the effort and 100% support it.

Re: Localized URLs, I'm firmly on the side of the fence that it should be opt-in. I can see both scenarios wanting to use it and not. A good chunk of the time I won't want URLs saying anything about language, but something where language is explicit to the business or organizational (or individual) aims of a website, sometimes it will be wanted in the URL. #depends

@arxpoetica
Member

I really like this direction, whatever form it ends up taking, and as long as it's opt in (it appears that it would be):

<p>{$t`Welcome back ${name}`}</p>

@germanftorres

I think i18n should be designed in sveltejs/sapper as a common ground between translation features and developer features and, yes, all of them are opinionated!

On the translation-oriented side, I think it's worth taking a look at https://projectfluent.org/ as well as the ICU message format ( http://userguide.icu-project.org/formatparse/messages ). They have put a lot of effort into designing a DSL that keeps translation logic out of application logic. Maybe just swapping JSON keys in translation files is too simplistic for the inherent complexity of language.

# Translation file en/messages.ftl
unread-emails =
    You have { $emails_count ->
        [0] no unread emails
        [one] one unread email
       *[other] { $emails_count } unread emails
    }.

<p>{ $t('unread-emails', { emails_count: userUnreadEmails } ) }</p>
<script>
  import { t } from 'sapper/svelte';
  export let userUnreadEmails;
</script>

It would be great to see sveltejs/sapper support out of the box one of these formats/libraries, maybe opting-in.

I have no clue how to codesplit translations, but it would be great to come up with a convention to lazy load translations as components are loaded into the application.

@Rich-Harris
Member Author

Rich-Harris commented Feb 23, 2019

I've received lots of really helpful pointers both here and on Twitter — thanks everyone. Having digested as much as I can, I'm starting to form some opinions... dangerous I know. Here's my current thinking:

The best candidate for a translation file format that I've seen yet is the 'banana' format used by MediaWiki (it also forms the basis of jquery.i18n). It seems to be able to handle most of the hellacious edge cases people have written about, while avoiding a) boilerplate, b) wheel reinvention and c) being prohibitively difficult for non-technical humans to read. (Oh, and it's JSON.) The fact that it's used by something as large as MediaWiki gives me hope. If anyone has experience with it and is alarmed by this suggestion, please say so! (One thing — I haven't quite figured out where number/date/currency formatting fit in.)

No idea why it's called 'banana'. Here's an example:

// en.json — input
{
  "@metadata": {
    "authors": []
  },
  "hello": "Hello!",
  "lucille": "It's {{PLURAL:$1|one banana|$1 bananas|12=a dozen bananas}} $2. How much could it cost, $3?"
}

I think it'd be possible to compile that to plain JavaScript, so that you could output a module you could consume like so:

import t from './translations/en.js';

console.log(t.hello); // Hello!
console.log(t.lucille(1, 'Michael', '$10')); // It's one banana Michael. How much could it cost, $10?
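
To give a feel for what that compilation step might involve, here is a small sketch covering only the two features used in the example: $n arguments and {{PLURAL:...}} with optional exact-number forms. It is not the MediaWiki or jquery.i18n implementation. compileMessage and compileLocale are hypothetical, and plural forms are assumed to be listed in CLDR category order.

```javascript
const CATEGORY_ORDER = ['zero', 'one', 'two', 'few', 'many', 'other'];

// Compile one banana-style message into a function of positional arguments.
function compileMessage(locale, source) {
  const rules = new Intl.PluralRules(locale);
  const available = rules.resolvedOptions().pluralCategories;
  const categories = CATEGORY_ORDER.filter((c) => available.includes(c));

  return (...args) => {
    // First resolve {{PLURAL:$n|...}} blocks, then substitute $n arguments.
    const expanded = source.replace(
      /\{\{PLURAL:\$(\d+)\|([^}]*)\}\}/g,
      (_, i, body) => {
        const n = Number(args[Number(i) - 1]);
        const forms = body.split('|');
        const exact = forms.find((form) => form.startsWith(`${n}=`));
        if (exact !== undefined) return exact.slice(exact.indexOf('=') + 1);
        const plain = forms.filter((form) => !/^\d+=/.test(form));
        const index = categories.indexOf(rules.select(n));
        return plain[Math.min(index, plain.length - 1)];
      }
    );
    return expanded.replace(/\$(\d+)/g, (_, i) => String(args[Number(i) - 1]));
  };
}

// Compile a whole file: plain strings stay strings, "@metadata" is skipped.
function compileLocale(locale, messages) {
  const out = {};
  for (const [key, value] of Object.entries(messages)) {
    if (key === '@metadata') continue;
    out[key] = /\$\d|\{\{/.test(value) ? compileMessage(locale, value) : value;
  }
  return out;
}
```

A real build step would emit this as generated source rather than closures, so that unused helpers can be tree-shaken, but the selection logic would be much the same.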

For code-splitting, this is what I'm thinking: Suppose you have global translations in locales/en.json (or e.g. languages/en.json, per @trbrc's comment) and some route-specific translations in src/routes/settings/languages/en.json. Sapper might generate two separate modules:

  • src/node_modules/@sapper/internal/i18n/en/0.js // global translations
  • src/node_modules/@sapper/internal/i18n/en/1.js // route-specific

The second of these files might look like this:

import translations from './0.js';

export default Object.assign({}, translations, {
  avatar: 'Avatar',
  notifications: 'Notifications',
  password: 'Password'
});

The route manifest for /settings could look like this:

{
  // settings/index.svelte
  pattern: /^\/en\/settings\/?$/,
  parts: [...],
  i18n: () => import('./i18n/1.js')
}

So when you first load the page, 0.js gets loaded, but when you navigate to /settings, the browser only needs to fetch 1.js (and it can get preloaded, just like the component itself and any associated data and CSS). This would all happen automatically, with no boilerplate necessary. And because it's just JSON it would be easy to build tooling that ensured translations weren't missing for certain keys for certain languages.

The banana format does steer us away from this...

<p>{t`Hello ${name}!`}</p>

...and towards this:

<p>{t.hello(name)}</p>

I'm not convinced that's such a bad thing — it's certainly less 'noisy', and forces you to keep your default language .json file up to date (which, combined with the tooling suggested above, is probably the best way to keep translations up to date as well).
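
The kind of tooling meant here can be very small. As a sketch, with checkTranslations being a made-up name and files standing in for the parsed JSON of each language:

```javascript
// Compare every locale against the default one and report differences.
function checkTranslations(files, defaultLocale) {
  const isMessage = (key) => !key.startsWith('@'); // skip e.g. "@metadata"
  const reference = Object.keys(files[defaultLocale]).filter(isMessage);
  const report = {};
  for (const [locale, messages] of Object.entries(files)) {
    if (locale === defaultLocale) continue;
    const keys = Object.keys(messages).filter(isMessage);
    report[locale] = {
      missing: reference.filter((key) => !keys.includes(key)),
      obsolete: keys.filter((key) => !reference.includes(key))
    };
  }
  return report;
}
```

Run during the build, a non-empty missing list could fail the build or just print a warning, depending on how strict a project wants to be.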

@thgh yep, should definitely have some sort of fallback. Not sure what this would look like — the simplest would obviously be to just have a single default. I'm not so sure about a directive, since it would entail changes to Svelte, and would make it harder to use translations in element attributes (or outside the template, in the <script>).

@chiefjester

@Rich-Harris I just drop it in here: https://github.com/lingui/js-lingui

What I like with this library is the tooling they have:
https://lingui.js.org/tutorials/cli.html#add-a-new-locale

Where you can:

  • add a new locale easily
  • extract translations automatically from your components
  • clean up obsolete messages
  • use pseudo-localization
  • use messages as IDs

The last part is particularly interesting: rather than creating a bunch of IDs for translation, it uses the actual content as the key. That makes it easy for any translator to edit; anyone with a text editor can add a translation and know what to do.

E.g. extracting translations from a component might look like this (taken from the js-lingui wiki):

{
  "Message Inbox": "",
  "See all <0>unread messages</0> or <1>mark them</1> as read.": "",
  "{messagesCount, plural, one {There's {messagesCount} message in your inbox.} other {There're {messagesCount} messages in your inbox.}}": "",
  "Last login on {lastLogin,date}.": "",
}

And a translated version would look like this:

{
  "Message Inbox": "Přijaté zprávy",
  "See all <0>unread messages</0> or <1>mark them</1> as read.": "Zobrazit všechny <0>nepřečtené zprávy</0> nebo je <1>označit</1> jako přečtené.",
  "{messagesCount, plural, one {There's {messagesCount} message in your inbox.} other {There're {messagesCount} messages in your inbox.}}": "{messagesCount, plural, one {V příchozí poště je {messagesCount} zpráva.} few {V příchozí poště jsou {messagesCount} zprávy. } other {V příchozí poště je {messagesCount} zpráv.}}",
  "Last login on {lastLogin,date}.": "Poslední přihlášení {lastLogin,date}",
}

It also introduces slots, which is, to be honest, a big deal in i18n. With translations, you probably want to style a word inside a translation. The old solution would be to add a new message ID for that particular item, even though the whole translation is supposed to be treated as one unit. The problem with taking that text out of the translation message is that it loses context. If a translator just sees a word without context, they could give a translation that doesn't fit the actual message.
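
As a sketch of how numbered slots can be resolved at render time (this is not Lingui's implementation; renderSlots is a hypothetical helper, nested slots are not handled, and the translated text is assumed to be trusted or already escaped):

```javascript
// Replace numbered tags such as <0>text</0> with the output of wrappers[0].
function renderSlots(message, wrappers) {
  return message.replace(/<(\d+)>(.*?)<\/\1>/g, (match, index, inner) => {
    const wrap = wrappers[Number(index)];
    // Leave the text bare if no wrapper was supplied for this slot.
    return wrap ? wrap(inner) : inner;
  });
}
```

The translator only ever sees the whole sentence with neutral markers, while the component decides what each marker becomes, for example a link for slot 0 and a button for slot 1.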

I think it's the cleanest solution I have seen among any library. Shout out to @tricoder42 for creating such an awesome library.

@tricoder42

Hey everyone, thanks @thisguychris for the mention.

I read the thread briefly and I have a few suggestions, if you don't mind:

(Having said that, the examples on https://www.i18njs.com point towards using more complete sentences as keys, i.e. "You have %n messages" rather than "you_have" and "message".)

I would really recommend this approach for two reasons:

  1. Context is very important for translators. Translating You have %n messages as a sentence will give more accurate translation than translating You have and message.
  2. The order of words in a sentence isn't the same for different languages. Order of words/chunks hardcoded in source code might break in the future for some language.

In an ideal world, someone would have already encoded all the different pluralisation rules already in a way that we can just reuse. I don't know if that's the case.

There actually is: plural rules for a very large number of languages are defined in CLDR. There are lots of packages on npm that parse CLDR data, like make-plural. A few languages are missing though (e.g. Haitian and Maori).

I wonder if that can be done with symbols like %c and %n and %d?

// locales/fr.json
{
  "Your payment of %c for %n widgets is due on %d": 
    "Votre paiement de %c pour %n widgets est dû le %d."
}

import { format_currency, format_date } from '@sveltejs/i18n/fr';

export default {
  'Your payment of %c for %n widgets is due on %d': (c, n, d) =>
    `Votre paiement de ${format_currency(c)} pour ${n} widgets est dû le ${format_date(d)}.`
};

ICU MessageFormat uses argument formatters:

Hello {name}, today is {now, date}

Formatting of date arguments depends on implementation, so it could be done using Intl.DateTimeFormat, date-fns, moment.js, whatever.

I've been thinking a lot about this approach and it's useful when you want to change date format in different locales:

Hello {name}, today is {now, date, MMM d}

but you could achieve the same in the code as well:

i18n._("Hello {name}, today is {now}", { name, now: i18n.format.date(now) })

where i18n.format.date is something like this:

// pseudocode using `format` from date-fns
function date(value) {
  // look up the date format string configured for the active locale
  const formatStr = this.formatStr[this.locale]
  return format(value, formatStr, { locale: this.locale })
}

I think both approaches have pros/cons and I haven't decided yet which one to use.
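To make the "format in code" variant concrete, here is a minimal sketch that uses only the standard Intl.DateTimeFormat API. The `interpolate` and `formatDate` helpers are hypothetical names for this example. They are not Lingui's API, and the placeholder syntax is only loosely modelled on the messages above.

```javascript
// Hypothetical helper: replaces {placeholders} in a message with values from `args`.
// Unknown placeholders are left untouched so missing arguments are easy to spot.
function interpolate(message, args) {
  return message.replace(/\{(\w+)\}/g, (match, key) =>
    key in args ? String(args[key]) : match
  );
}

// Hypothetical per-locale date formatter built on the standard Intl API.
function formatDate(value, locale) {
  return new Intl.DateTimeFormat(locale, {
    month: 'short',
    day: 'numeric',
    timeZone: 'UTC' // fixed time zone so the output does not depend on the host
  }).format(value);
}

const now = new Date(Date.UTC(2019, 1, 24));
const greeting = interpolate('Hello {name}, today is {now}', {
  name: 'Ada',
  now: formatDate(now, 'en-US')
});
```

With this split, translators only ever see `{now}`, and the decision about how a date looks in each locale stays in code.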

Code-splitting

I've just had a discussion via email with one user about this. I'm thinking about a webpack plugin which could generate i18n files for each chunk automatically. I haven't figured out how to load them automatically, but the route manifest that you've posted might solve that as well.


Just flushing some ideas I've been playing with in past weeks :)

@AlexxNB

AlexxNB commented Feb 24, 2019

@thisguychris suggested exactly what I want to see in an i18n system!

And one more advantage: using t`Hello ${name}` makes the component more reusable. It is the simplest way to make an i18n-ready component. The developer won't have to care about distributing the component with a lang JSON file (or may include one, when translations are ready).

Perhaps, autogenerated json structure may have the structure of nested components:

{
  "App": {
    "My app": "Моё приложение"
  },
  "App.NestedComponent": {
    "Ave, ${name}!": "Славься ${name}!"
  },
  "App.AnotherNestedComponent": {
    "Ave, ${name}!": "Да здравствует ${name}!"
  }
}

It will encapsulate the phrases in their components. This is useful for cases when the same phrase may have different translations in different contexts.

@mattpilott

I wanted to chime in just to say that I'm experiencing all the issues mentioned in the original issue above on my latest project. It's a web version of a mobile app, needs to support 19 languages at launch and is completely API-driven.

I was delighted to hear that this is being considered in sapper!

@Rich-Harris
Member Author

Thanks @thisguychris, @tricoder42 — Lingui is incredibly impressive. The thought and care that has gone into the tooling is amazing.

I've been thinking more about strings versus keys, and I'm coming down on the side of keys, for a couple of different reasons. (For clarity, I'm not suggesting you_have and message over You have %n messages, but rather you_have_n_messages.)

Firstly, string-based approaches typically end up with the source language text included in the production build ("Hello world!":"Salut le monde!"). In theory, with a key-based approach, {t.hello_world} could even reference a variable (as opposed to an object property) if t is a module namespace, which is inherently minifiable. Even if we couldn't pull that off, property names will generally be smaller (welcome_back as opposed to "Good to see you again!"). You could eliminate source language text with a sufficiently sophisticated build step, but not without adding complexity.

Secondly, and perhaps more importantly, I worry about requiring developers to be copywriters. Imagine you have a situation like this...

<p>{t`You have no e-mails`}</p>

...and someone points out that we don't hyphenate 'emails' any more — all of a sudden the keys for your other translations are out of date, so you have to go and fix them.

Then a copywriter comes along and says that it's the wrong tone of voice for our brand, and should be this instead:

<p>{t`Hooray! Inbox zero, baby!`}</p>

Of course, that text should also eventually be translated for other languages, but by putting that text in the source code don't we decrease the stability of the overall system?

Slots

The slots feature is very cool. Unfortunately it doesn't really translate (pun not intended) to Svelte, since you can't pass elements and component instances around as values. The closest equivalent I can think of to this...

<p>
   <Trans>
      See all <Link to="/unread">unread messages</Link>{" or "}
      <a onClick={markAsRead}>mark them</a> as read.
   </Trans>
</p>

...is this:

<p>
  {#each t.handle_messages as part}
    {#if part.i === 0}<a href="/unread">{part.text}</a>
    {:else if part.i === 1}<button on:click={mark_as_read}>{part.text}</button>
    {:else}{part.text}{/if}
  {/each}
</p>

That assumes that t.handle_messages is a (generated) array like this:

[
  { text: 'See all ' },
  { text: 'unread messages', i: 0 },
  { text: ' or ' },
  { text: 'mark them', i: 1 },
  { text: ' as read.' }
]

Obviously that's much less elegant and harder to work with, but maybe that's a rare enough case that it's ok not to optimise for? We can pay for the loss of elegance in other places.
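For illustration, here is one way such an array could be generated from a translated string at build time. The numbered-tag syntax (`<0>...</0>`) is an assumption, similar in spirit to how some i18n tools mark up rich messages, and `toParts` is a hypothetical name.

```javascript
// Hypothetical build-time helper: splits a message containing numbered tags
// into plain-text parts and indexed parts that a template can map to elements.
function toParts(message) {
  const parts = [];
  const pattern = /<(\d+)>(.*?)<\/\1>/g;
  let last = 0;

  for (const match of message.matchAll(pattern)) {
    // plain text between the previous tag and this one
    if (match.index > last) {
      parts.push({ text: message.slice(last, match.index) });
    }
    parts.push({ text: match[2], i: Number(match[1]) });
    last = match.index + match[0].length;
  }

  // trailing plain text after the last tag
  if (last < message.length) {
    parts.push({ text: message.slice(last) });
  }
  return parts;
}

const parts = toParts(
  'See all <0>unread messages</0> or <1>mark them</1> as read.'
);
```

Because the indices travel with the text, a translation is free to reorder the tagged chunks without the template changing.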

Currency and date formatting

I hadn't actually realised until today that Intl is supported basically everywhere that matters. For some reason I thought it was a new enough feature that you still needed bulky polyfills.
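A quick sketch of what the built-in Intl API covers without any extra dependency. The wrapper names `formatCurrency` and `formatNumber` are made up for this example; only `Intl.NumberFormat` itself is a standard API.

```javascript
// Thin hypothetical wrappers around the standard Intl.NumberFormat API.
function formatCurrency(amount, locale, currency) {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(amount);
}

function formatNumber(value, locale) {
  return new Intl.NumberFormat(locale).format(value);
}

const usd = formatCurrency(1234.5, 'en-US', 'USD');
const grouped = formatNumber(1234567.891, 'de-DE');
```

Grouping separators, decimal marks and currency symbol placement all come from the locale data shipped with the engine, so none of it needs to live in translation files.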

Distributed components

@AlexxNB that's a very interesting case that I hadn't considered. I think it changes the nature of the problem though — since t doesn't have any meaning to Svelte per se (so far, we've been talking about adding the feature to Sapper) we would have to add a new primitive. Maybe it's something like this, similar to the special @html tag:

<p>{@t hello_world}</p>

But that opens a can of worms, since Svelte now has to have opinions about i18n, which inevitably leads to opinions about project folder structure etc. I think it's probably more practical if components simply expose an interface for passing in translations:

<script>
  import VolumeSlider from '@some-ui-kit/svelte-volume-slider';
  import { t } from '@sapper/app'; // or whatever

  let volume = 0.5;

  const translations = {
    mute: t.mute
  };
</script>

<VolumeSlider bind:volume {translations}/>

I think we want to avoid referencing component filenames in translation files, since it's not uncommon to move components around a codebase.

@AlexxNB

AlexxNB commented Feb 24, 2019

Secondly, and perhaps more importantly, I worry about requiring developers to be copywriters.

Imagine another case: a developer changes the text value of some key in en.json (the main language of the app, the source of truth for all translators). Translators can't even know about this. They don't have any built-in tool for updating their translations, except looking for diffs on GitHub.
But using strings instead of keys, you can make something like this:

sapper --i18n-check ru.json

As a result, you can see that some phrases are gone and some new phrases were added.
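A rough sketch of what such a check could compute. The `sapper --i18n-check` command above is a suggestion rather than an existing CLI flag, and `checkTranslations` is a hypothetical helper that assumes the source phrases have already been extracted from the components.

```javascript
// Hypothetical check: compares phrases found in source code with the keys
// present in one translation file.
function checkTranslations(sourcePhrases, translations) {
  const translated = new Set(Object.keys(translations));
  const source = new Set(sourcePhrases);
  return {
    // phrases used in the app that the translation file does not cover yet
    missing: sourcePhrases.filter((phrase) => !translated.has(phrase)),
    // keys in the translation file that no longer appear in the app
    stale: Object.keys(translations).filter((phrase) => !source.has(phrase))
  };
}

const report = checkTranslations(
  ['My app', 'Ave, ${name}!', 'Sign out'],
  { 'My app': 'Моё приложение', 'Log out': 'Выйти' }
);
```

Here `report.missing` lists what a translator still has to do and `report.stale` lists entries that can be dropped.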

@bernardoadc

My two cents on language detection: how about some bootstrap function where the developer can do whatever he wants to detect language and return the result? This way it could analyze URL path, subdomain, cookies, whatever.. less opinionated but still very simple
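A minimal sketch of what such a bootstrap function might look like. Everything here is hypothetical: the `detectLocale` name, the `lang` cookie, the supported locale list and the order of precedence are assumptions for illustration, not an existing Sapper hook.

```javascript
const SUPPORTED_LOCALES = ['en', 'fr', 'ru']; // assumed for this example
const DEFAULT_LOCALE = 'en';

// Hypothetical detection hook: URL prefix first, then cookie, then Accept-Language.
function detectLocale({ pathname = '/', cookie = '', acceptLanguage = '' }) {
  // 1. Locale prefix in the path, e.g. /fr/foo
  const fromPath = pathname.split('/')[1];
  if (SUPPORTED_LOCALES.includes(fromPath)) return fromPath;

  // 2. A `lang` cookie set by a language switcher
  const cookieMatch = /(?:^|;\s*)lang=([^;]+)/.exec(cookie);
  if (cookieMatch && SUPPORTED_LOCALES.includes(cookieMatch[1])) return cookieMatch[1];

  // 3. Accept-Language header, highest quality value first
  const preferred = acceptLanguage
    .split(',')
    .map((entry) => {
      const [tag, q] = entry.trim().split(';q=');
      return {
        lang: tag.split('-')[0].toLowerCase(),
        q: q === undefined ? 1 : Number(q)
      };
    })
    .sort((a, b) => b.q - a.q);

  for (const { lang } of preferred) {
    if (SUPPORTED_LOCALES.includes(lang)) return lang;
  }
  return DEFAULT_LOCALE;
}
```

An app that prefers subdomains or a user profile setting would simply swap in a different function with the same return value.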

@laurentpayot

laurentpayot commented Feb 25, 2019

Since Sapper/Svelte is a compiler, what about using a single file for all the locales:

// locales.json
{
  "@metadata": {
    "authors": {
      "en": ["Lancelot"],
      "fr": ["Galahad"]
    }
  },
  "quest": {
    "en": "To seek the Holy Grail!",
    "fr": "Chercher le Saint Graal !"
  },
  "favoriteColour": {
    "en": "Blue.",
    "fr": "Bleu. Non !"
  }
}

and letting Sapper generate the respective locale files:

// en.json
{
  "@metadata": {
    "authors": ["Lancelot"]
  },
  "quest": "To seek the Holy Grail!",
  "favoriteColour": "Blue."
}

// fr.json
{
  "@metadata": {
    "authors": ["Galahad"]
  },
  "quest": "Chercher le Saint Graal !",
  "favoriteColour": "Bleu. Non !"
}

This way maintaining keys/values would be much easier in a single file than across several (19?!) files, don't you think? Just my $0.02…
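The generation step could be sketched roughly as below. `splitLocales` is a hypothetical helper, not something Sapper provides, and it assumes that any object whose keys are all known locale codes is a per-locale leaf.

```javascript
// Hypothetical build step: turns one combined locales object into one object per locale.
function splitLocales(combined, locales) {
  const isPlainObject = (node) =>
    node !== null && typeof node === 'object' && !Array.isArray(node);

  // A leaf is an object keyed only by locale codes, e.g. { en: "...", fr: "..." }
  const isLeaf = (node) =>
    isPlainObject(node) &&
    Object.keys(node).length > 0 &&
    Object.keys(node).every((key) => locales.includes(key));

  const pick = (node, locale) => {
    if (!isPlainObject(node)) return node;
    if (isLeaf(node)) return node[locale];
    const out = {};
    for (const [key, child] of Object.entries(node)) {
      const value = pick(child, locale);
      // skip keys that have no translation for this locale
      if (value !== undefined) out[key] = value;
    }
    return out;
  };

  return Object.fromEntries(locales.map((locale) => [locale, pick(combined, locale)]));
}

const perLocale = splitLocales(
  {
    '@metadata': { authors: { en: ['Lancelot'], fr: ['Galahad'] } },
    quest: { en: 'To seek the Holy Grail!', fr: 'Chercher le Saint Graal !' },
    favoriteColour: { en: 'Blue.', fr: 'Bleu. Non !' }
  },
  ['en', 'fr']
);
```

Keys that lack a translation for a given locale are simply left out of that locale's object, which also makes incomplete translations easy to detect.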

@antony
Member

antony commented Feb 25, 2019

If the format is compatible with the format output by tools like https://lingohub.com/ (which outputs in a format similar to what @laurentpayot has suggested), that'd be excellent.

@bernardoadc

bernardoadc commented Feb 26, 2019

@laurentpayot but how would one add a specific language easily? The format is great, but cumbersome to add/remove languages because it means traversing the single file.

This could be solved (although not ideally) if every sentence/word had a key/number associated. Then it would be easy to see them in that format, but stored in separate files. The "main" language (the one the dev is familiar with) would dictate those keys. Any file missing them or having extra ones would be "wrong"

@laurentpayot

laurentpayot commented Feb 26, 2019

@khullah Do you mean when several translators are involved and working together? If that's what you mean then I agree it can be cumbersome.
Removing a language from a centralized file is as simple as sed '/"fr":/d' locales.json if there is one translation per line.
I don't know about other people, but at least for me, modifying, adding and deleting keys occurs much more often than adding/deleting a whole language.

@Rich-Harris
Member Author

I really like @laurentpayot's idea. Bear in mind this can also be augmented with tooling — as long as there's a well-understood format and folder structure, you could create an interface for adding and removing languages, as well as translating specific keys (and keeping track of which ones were incomplete, etc). It could even be built in to Sapper's CLI!

While I'm here: had a good chat with @thisguychris the other day about authoring, in which he challenged my stance that we should use keys (as opposed to source language strings) for this. He likened it to BEM, having to have a naming structure for stuff that's maintained in a file to which you're tightly coupled at a distance.

I think there's merit to that claim. So while I do think that keys have some important advantages...

  • much easier to keep control of bundle sizes without convoluted tooling
  • easier to understand 'what's going on' wrt the underlying mechanics with {t.hello(name)} over {t`Hello ${name}!`}
  • possibility of disambiguating between translations that are context-dependent in some languages, but not in the source language
  • stability, since fixing typos doesn't invalidate translations
  • overall structure and organisation may be preferable to some

...it's true that in cases where you get the translations before you build the app, using strings might be preferable. So I guess I still lean towards keys, but my position isn't set in stone.

@Rich-Harris
Member Author

Re disambiguation — just reading this piece that was doing the rounds today which contains a great example of a situation where using a source language string as a key will result in suboptimal translations:

An example that I love to use is the term “Get started.” We use that in our products in a lot of places and, in American English, it’s pretty standard. It’s so understandable that people don’t even think of the fact that it can be used in three or four ways. It could be a call to action on a button. Like, “Get started. Click here.” It could be the title of the page that’s showing how you get started. It can be the name of a file: a Get Started guide PDF. All of those instances need to be translated differently in most other languages.

@NikolayMakhonin
Contributor

Localised URLs

/fr/foo

I think this is the best option, because:

  1. It will be convenient for the export function, because each language will get a separate folder on the server with its own localized HTML files.
    1.1. It will work without additional JavaScript (which is important for SEO), and without Node.js.
  2. Search engine bots will see unique content at each URL.

@bernardoadc

bernardoadc commented Feb 27, 2019

Yes @laurentpayot, that's what I meant, but not only that. It would be difficult to have some sort of phrasing dictionary from other projects to import from, which would be a great thing. I think removing a language occurs less often than adding one.

That being said, it does help human translators to see and understand context, provide a tasklist, help maintain all langs in sync, etc, as mentioned by @Rich-Harris. And this is actually something I would strive for: promote whatever is better for the devs (compilation capabilities should be explored to the maximum; it is the distinguishing feature from all other frameworks, after all).

Actually.. just realized that it would not be hard to take someLanguage.dictionary.json and pre-fill in that format as well, since keys are kinda like nicknames for each phrasing. "Hello" would be filled with a default translation, which translators could later adapt if necessary for the given project.

Even more, several files could provide better context + modularization:

// greetings or home or xComponent.i18n.json
{
  "hello": {
     "en": "Hello!",
  ...
}

// yComponent.i18n.json
{
  "message": {
     "en": "some message",
  },
  "variants": {
     "en": ["some message","same message!!","Hey, another message"]  
  },
  ...
}

So yeah, I like your format :)
I wouldn't even compile to all '19' files, just leave as is. A single i18n file per component/module/context. How it will be loaded onto the app doesn't matter to me, as long as it works.

note: l10n of currency, numbers and dates would be in yet another (global) file (if needed, since there is moment.js etc)

// en.l10n.json — input
{
  "number": { ... },
  "date": {
    "short": ""
  },
  "currency": "U$ #,##0.00"
}

@Rich-Harris

<p>{t.hello(name)}</p> seems fine to me and goes pretty well with the above format

The slots feature is very cool

Yeap. Way better than the second example you gave. Didn't catch why it isn't simple to do?

@Rich-Harris
Member Author

Didn't catch why it isn't simple to do?

It's just a difference between how React and Svelte work. In React, elements are just variables — you can do this sort of thing...

var element = <p>some text</p>;
return <div>{element}</div>;

and by extension, you can transform the <Trans> component in Lingui into something that can move chunks of virtual DOM around at runtime depending on word order.

In Svelte, everything is precompiled. The assumption (which holds for 99% of cases, but not this case) is that the structure of your application can be known at build time. Using that, it can generate code that starts and updates much faster than is possible with a virtual DOM; the lack of support for Lingui-style slots is the tradeoff.

@saabi

saabi commented Feb 28, 2019

It seems nobody mentioned URLs were originally intended (if I recall correctly) to serve as a single source for a particular piece of information, independent of presentation. That way, they could either present the information resource in English, Klingon, JSON, binary or whatever, depending on the HTTP negotiation.

Nobody does this nowadays, for good practical reasons (which also depend on available technology, which could change), but it was the original intent. And though I may be biased, because the purist in me likes the theoretical elegance, I think the option should be left open for that.

Also, the language selection mechanism should be selectable itself. We should be able to configure, or customize, how Sapper determines the language.

Localized URLs.

I like the idea, but keeping in sync with what I said before, THEORETICALLY, there should be a canonical URL that can present data in any language, also including machine readable ones, and then you can have alternative localized URLs to the same resource, which may suggest a presentational preference for its content.

For example...

  • canonical: my.site/some/resource -> can present in any format (English, JSON, French, etc, depending on HTTP negotiation or another Sapper-configurable selection mechanism)
  • JSON: my.site/api/some/resource or json.my.site/some/resource (configurable)
  • French: my.site/fr/une/resource or fr.my.site/une/resource or my.site/une/resource (also configurable..)
    etc. ...

Anyway, all I'm saying is we should keep that flexibility.

EDIT:
In other words, it's recommended (by the designers) that the URL -> Resource relation is many to one rather than the inverse. I'll go and find a reference anyway, tomorrow.

And then again, it's OK to think of the same information in another language as a separate resource.

@ocombe

ocombe commented Mar 1, 2019

Hello there! I'm a member of the Angular team, and I work on i18n there. I thought that I could share some of my knowledge to help you get started:

  • if you can avoid touching dates/currencies/numbers and use Intl instead, it's better. Dealing with those is a major pain, you'll discover new traps every day: people that don't use the Gregorian calendar, right-to-left languages, different number systems (Arabic or Hindu for example), ... For Angular we decided to drop Intl because of browser inconsistencies. Most modern browsers have good Intl support, but if you need to support older browsers then you'll have bugs and differences. In retrospect, sticking with Intl might have been a better choice...
  • all major vendors (IBM, oracle, google, apple, ...) use CLDR data as the source of truth: http://cldr.unicode.org/. They export their data in xml or json (https://github.com/unicode-cldr). We use the npm modules "cldrjs" and "cldr-data-downloader" (https://github.com/rxaviers/cldrjs) developed initially for jquery globalize to access the CLDR json data. We also use "cldr" (https://github.com/papandreou/node-cldr) to extract the plural rules. You can find our extraction scripts here: https://github.com/angular/angular/tree/master/tools/gulp-tasks/cldr if you want to take a look at it.
  • if you can, use a recognized format for your translations so that your users can use existing translation software. One of the main formats is XLIFF but it uses XML which is very complicated to read/write in js. Stick to JSON if you can. There are a few existing JSON formats that are supported by tools, you should research the existing ones and choose one of them, it'll make the life of your users so much easier, and you will be able to reuse some external libraries. Some examples are i18next JSON https://www.i18next.com/misc/json-format or Google ARB https://github.com/googlei18n/app-resource-bundle/wiki/ApplicationResourceBundleSpecification. Don't try to reinvent the wheel here.
  • For plural rules, use CLDR data http://cldr.unicode.org/index/cldr-spec/plural-rules
  • ICU expressions are a nice way to deal with plurals, ordinals, selects (gender), ... but there is no documentation for js... you can read a bit here: http://userguide.icu-project.org/formatparse/messages and on the angular docs https://angular.io/guide/i18n#regular-expressions-for-plurals-and-selections
  • you need to follow a rule for locale identifiers. I recommend BCP47 which is what CLDR uses with a few optimizations (http://cldr.unicode.org/core-spec#Unicode_Language_and_Locale_Identifiers), some doc to help you pick the right identifier: http://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code
  • id or non-id based keys: use either auto generated ids (with a hashing/digest algorithm) or manual id (keys that the user specifies). Never use the sentences as keys because you'll run into problems with your json and some special characters, you'll get very long keys which will increase the size of the json files and make them hard to read, and you'll get duplicates (the same text with different meanings depending on the context), which brings me to my next point...
  • you need to support optional descriptions and meanings, those are very important for translators. Descriptions are just some text that explains what this text is, while meaning is what the translators should use to understand how to translate this text depending on the context of the page and what this text represents. The meaning should be used to generate the ids (keys) so that you don't have duplicates with different meanings.
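To illustrate the last two points, here is a minimal sketch of deriving message ids from the text plus its meaning. The FNV-1a hash is chosen only because it is short enough to show inline. It is not what Angular or any tool named above uses, and `messageId`, `register` and `catalog` are hypothetical names.

```javascript
// 32-bit FNV-1a hash over the UTF-8 bytes of the input, returned as 8 hex digits.
function fnv1a(input) {
  let hash = 0x811c9dc5;
  for (const byte of new TextEncoder().encode(input)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// The meaning is part of the hashed input, so identical text with a
// different meaning gets a different id.
function messageId(text, meaning = '') {
  return fnv1a(`${meaning}\u0000${text}`);
}

// Hypothetical catalog keeping the description and meaning next to the text
// so that translators can see them.
const catalog = {};

function register(text, { meaning = '', description = '' } = {}) {
  const id = messageId(text, meaning);
  catalog[id] = { text, meaning, description };
  return id;
}

const buttonId = register('Get started', {
  meaning: 'call to action',
  description: 'Label of the primary button on the landing page'
});
const titleId = register('Get started', {
  meaning: 'page title',
  description: 'Heading of the onboarding guide'
});
```

The two "Get started" entries end up under separate ids, so each can be translated differently while the source text stays identical.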

@LorisSigrist
Contributor

A combination of #5703 and #11223 would enable i18n routing to be built entirely in userland.

@stepanorda

@Rich-Harris We have more and more libs, each having a lot of problems or even broken by design (race conditions and global state). Most of them are not maintained well. And this is one of the core things most projects outside of North America need. I think we need an official library or even build this into the framework itself. A lot of my friends that I recommend SvelteKit to don't want to migrate from Nuxt precisely because of the lack of proper i18n. Both Nuxt and Next have official support for i18n.

@xpuu

xpuu commented Dec 20, 2023

Although it would be nice, I don't believe that a complete official library is desperately needed at the moment. A well-designed and documented API that lets you hook into the routing system (for managing routes and possibly domains) and the build system (for generating texts and assets) will be more than sufficient for starters.

@jasongitmail

Related:

I added i18n support to super-sitemap (npm) for SvelteKit, which takes care of the sitemap aspect, assuming a developer's URLs use a pattern such as /about (for the default lang) and /zh/about, /es/about, etc (for alternate langs).

The generated sitemap will include proper multi-lingual site annotations as Google expects.

@vytenisstaugaitis

@Rich-Harris We have more and more libs, each having a lot of problems or even broken by design (race conditions and global state). Most of them are not maintained well. And this is one of the core things most projects outside of North America need. I think we need an official library or even build this into the framework itself. A lot of my friends that I recommend SvelteKit to don't want to migrate from Nuxt precisely because of the lack of proper i18n. Both Nuxt and Next have official support for i18n.

There is an entire paragraph at the end of the SvelteKit 2 announcement addressing this issue.

@jacob-8
Contributor

jacob-8 commented Dec 21, 2023

Although it would be nice, I don't believe that a complete official library is desperately needed at the moment. A well-designed and documented API that lets you hook into the routing system (for managing routes and possibly domains) and the build system (for generating texts and assets) will be more than sufficient for starters.

I completely agree. The basic tasks of i18n are easy in SvelteKit once you understand how to use its various routing techniques to set the locale per request (see my instructions on how to get going w/o race condition problems - and note that you don't even need a library). As I see it, SvelteKit support would be for the purpose of making advanced tasks simple. Things like:

  • translated routes (I wouldn't use this at the moment)
  • easy code-splitting of translation strings (I would love this!)

@LorisSigrist
Contributor

The one advantage I could see to built-in i18n support w/ locale management instead of just router hooks is per-language code splitting. I don't think it's possible to code-split your translation strings both by language and per page unless each language has its own entry point.

@samuelstroschein
Contributor

@stepanorda we (https://inlang.com/), most notably @LorisSigrist who develops Paraglide JS, are pushing PRs (#11396, #11178) towards SvelteKit in collaboration with the Svelte team now.

In short, i18n should become better for SvelteKit soon. Higher prioritization from the Svelte team would be welcome ofc.

@samuelstroschein
Contributor

Progress 🎉 SvelteKit 2.3.0 includes #11537 from @LorisSigrist

@LorisSigrist
Contributor

Further Progress - We released Paraglide JS Adapter SvelteKit which makes use of the new reroute hook + a link preprocessor to enable a bunch of new features:

  • Zero Effort i18n Routing
  • Automatic Link Translation
  • Translated Pathnames

It's still in prerelease mode, but we would love to get some feedback!

@eddow

eddow commented Apr 18, 2024

Something I quickly made for my application is composing keys.
For example field.name or field.date. This is really practical when you make a generic fields list and can use field.${name} for the label, or when you have menu entry points, enum entries, ... You then don't have to write all the keys manually, but can let components generate them in a standardized way.
With the store, it could end up using $t.field.date or $t.field[fieldName]
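A small sketch of the composed-key lookup, independent of any store. The `translate` helper and the dictionary shape are assumptions for illustration.

```javascript
// Hypothetical lookup: resolves a dotted key such as "field.date" against a
// nested dictionary, falling back to the key itself when nothing is found.
function translate(dictionary, key) {
  const value = key
    .split('.')
    .reduce(
      (node, part) =>
        node !== null && typeof node === 'object' ? node[part] : undefined,
      dictionary
    );
  return typeof value === 'string' ? value : key;
}

const dictionary = {
  field: { name: 'Name', date: 'Date' }
};

// A generic field list can build its label keys from the field names.
const labels = ['name', 'date', 'email'].map((fieldName) =>
  translate(dictionary, `field.${fieldName}`)
);
```

Returning the key for missing entries ("field.email" here) keeps untranslated labels visible during development instead of rendering an empty string.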

@SrGeneroso
Contributor

SrGeneroso commented Apr 23, 2024

I don't know if SvelteKit has to include i18n directly, but the site and docs should. I've been going around some official international channels on Discord and there is not much to see. Mine (Spanish) has a fork deployed with docs for Svelte but not SvelteKit. I think if the main domain had that internationalization integrated, more people would collaborate on the translation and expansion of Svelte. It seems Paraglide could be our savior, but I wonder if the internal team is working on gathering the resources to make it possible.

@eddow

eddow commented May 3, 2024

Hi guys. I don't have anything to sell and really just wish to share good experience. For a summary, I used to learn 80x86 assembly, Hungarian and TypeScript (among other things)
I had to implement the i18n part from scratch like a dozen times and, because I got lazy about it, I made a library out of all my trials/errors/mistakes/successes.
I don't say you should use this one exactly, but it has some well & deeply thought-out concepts and structure.
https://github.com/emedware/omni18n
(for example the concept of "software zone" that I didn't see anywhere else, though it might exist; I'm used to re-inventing the wheel)

I'm like you, I hope svelte will keep on kicking a*ses, I just make stuff available, I hope it will fuel your brainstorming and inspire.
Note: I'm working now on the svelte4 adaptation.

@ivands

ivands commented Sep 29, 2024

I want to share my idea of using the Svelte compiler to make i18n a lot simpler.

I wrote a working plugin to show the idea:
https://github.com/awsless/awsless/tree/main/packages/i18n

My suggestion would be to make a custom $t rune.

$t`${count} count`

The compiler will try to find all the translations for the given template string in a global JSON file.

// locales.json
{
 "${count} count": {
    "es": "${count} contar",
    "jp": "${count} カウント"
  }
}

If the compiler finds existing translations we replace the original rune with:

$t.get(`${count} count`, {
  es: `${count} contar`,
  jp: `${count} カウント`,
})

The $t.get function returns the text of the current locale.

If the compiler didn't find any existing translations we add the text that needs to be translated to the file.

// locales.json
{
 "${count} count": {}
}

From this point on, the developer can translate all the text inside locales.json with any tool he likes.
For example, you could easily feed the file to some AI bot.
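To make the runtime side of this idea easier to picture, here is a rough sketch of what a `get` function along these lines might do. It is not the code from the linked plugin. `createT` and `setLocale` are hypothetical names, and a plain `t` object stands in for the proposed `$t` rune.

```javascript
// Hypothetical runtime: holds the current locale and picks the matching
// translation, falling back to the source text when none exists.
function createT(initialLocale) {
  let locale = initialLocale;
  return {
    setLocale(next) {
      locale = next;
    },
    get(source, translations) {
      return Object.prototype.hasOwnProperty.call(translations, locale)
        ? translations[locale]
        : source;
    }
  };
}

const t = createT('es');
const count = 3;

// Roughly what the compiled output of the tagged template would call.
const spanish = t.get(`${count} count`, { es: `${count} contar` });

t.setLocale('fr');
const fallback = t.get(`${count} count`, { es: `${count} contar` });
```

Since every template literal is already evaluated by the time `get` runs, the runtime stays tiny, at the cost of shipping all locales' strings in the same bundle unless the compiler filters them.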

@bugproof

@ivands inlang already has this feature I think via sherlock vscode extension. I think nothing beats inlang when it comes to i18n rn.

@ivands

ivands commented Oct 13, 2024

@bugproof, With inlang I personally dislike the fact that you will need to think of property names for every text that needs to be translated.
Coming up with names is hard.
You also need a vscode plugin to see the text in your app.
And it also adds more unneeded complexity to your app.

The $t rune works in a much simpler way.
And with my Vite plugin example, it only adds 373 bytes (uncompressed) to your app.
https://github.com/awsless/awsless/tree/main/packages/i18n

@samuelstroschein
Contributor

samuelstroschein commented Oct 13, 2024

@ivands you need identifiers for messages. here is an in-depth answer why ids are required https://github.com/orgs/opral/discussions/599#discussioncomment-5754261

extracting messages as text seems easier at first until you need to collaborate with translators, designers, product managers, etc. inlang started out with using text as identifier like your solution. see this Reddit post from 3 (!) years ago.


@ivands thanks for the compliment re inlang ❤️

@ivands

ivands commented Oct 14, 2024

@samuelstroschein, I think the downsides of using the English source text as ID don't apply to my example.
Let's say we change the source text of text that has already been translated.
If the source text no longer exists, we simply remove it from the translation file.
Yes, we need to re-translate the new source text. But you need to do that anyway.
I would also argue that removing the reference to the old translations is a good thing because at least now you clearly know what text still needs to be translated.
Otherwise, the new source text still has a reference to older translations and without updating the translations it would be wrong.

Also in my example, I use AI to translate the text, this would negate the problem altogether.

@stepanorda

stepanorda commented Oct 14, 2024

@ivands you can't rely solely on AI for translation. At least not yet. Having English as a source is unmaintainable on large projects and teams. I can give you a very simple example: Imagine you missed a comma in the English version. Would you re-translate the whole thing for all the languages? Or manually edit your JSON? And all 100 mentions of it all over your code? If you think that AI is good enough, maybe that's OK, but for most, it's not. So somebody already translated this string, and it was fine, and now they have to go and translate (or at least check the output of the AI) again for all languages?

@vytenisstaugaitis

@ivands you need identifiers for messages. here is an in-depth answer why ids are required https://github.com/orgs/opral/discussions/599#discussioncomment-5754261

extracting messages as text seems easier at first until you need to collaborate with translators, designers, product managers, etc. inlang started out with using text as identifier like your solution. see this Reddit post from 3 (!) years ago.

@ivands thanks for the compliment re inlang ❤️

Agree 100%. On certain projects, we use an i18n tool that uses text as a source, and it's such a pain when the source text needs to be edited or changed. Thankfully, it doesn't happen often, but when it does, it's pain.

@bugproof

bugproof commented Oct 14, 2024

@ivands

With inlang I personally dislike the fact that you will need to think of property names for every text that needs to be translated.

With Sherlock you don't have to. It generates random identifiers for you. opral/monorepo#1892
Also you have tight framework integrations with inlang and multiple localized routing strategies. Personally using inlang for both next.js and sveltekit apps and so far it works good in prod with SEO support too.

@mathg

mathg commented Oct 28, 2024

With the inclusion of Paraglide in the new svelte CLI, does this mean it is the official recommended solution for i18n?

image

@vytenisstaugaitis

With the inclusion of Paraglide in the new svelte CLI, does this mean it is the official recommended solution for i18n?

image

I think it's a recommended solution since SvelteKit 2. It works and it works smoooooth.

@gabrielstellini

gabrielstellini commented Oct 29, 2024

With the inclusion of Paraglide in the new svelte CLI, does this mean it is the official recommended solution for i18n?

https://inlang.com/m/gerre34r/library-inlang-paraglideJs/usage#complex-formatting

The Message Format is still quite young, so advanced formats like plurals, formatting functions, and markup interpolation are currently not supported but are all planned

Most projects I worked on require plurals - and many languages handle these differently so it's not just a simple situation of a switch/case in JS.

@willfarrell

@gabrielstellini 100%. We talked to a bunch of translators before settling on Fluent for all of our translation files. Their JS lib doesn't support tree-shaking, so we built https://github.com/willfarrell/fluent-transpiler to meet our needs.

@samuelstroschein
Contributor

@gabrielstellini Paraglide JS has variant support, which includes pluralization, gendering, and more use cases on the dev branch.

The release of Paraglide JS 2.0 is blocked until we have lix 1.0. You can read more here. In addition, Paraglide JS 2.0 can load any translation file. Doesn't matter if it's ICU, Fluent, arb, XML, or something else (@willfarrell).

When is the release of Paraglide JS 2.0? Probably Q1 2025 if we manage to get lix 1.0 stable.
