Capital inconsistencies between languages #416

Closed
axelboc opened this issue Apr 11, 2021 · 3 comments · Fixed by #435

Collaborator

axelboc commented Apr 11, 2021

Looking at #409, I noticed that the proclaimed capital of Palestine is actually inconsistent on Wikipedia across languages:

It's an even split! 😂 I decided not to fix it right away, as I thought it required further discussion... so here I am.

It actually reminded me of other capital inconsistencies we've discussed in the past, like Kiribati #166... So for fun, I decided to do a quick review of all the capital inconsistencies currently in the deck:

  • Montenegro: ES has "Podgorica, Cetiña" but all other languages have only Podgorica.
  • Palau: FR has "Melekeok" but all other languages have Ngerulmud.
  • European Union: NB has "Brussel, Strasbourg, Luxembourg" but all other languages have only Brussel.
  • Sri Lanka: NB has "Colombo" but all other languages have Sri Jayewardanapura Kotte. Norwegian Wikipedia now lists both: "Sri Jayewardanapura Kotte, Colombo"; it's better but it's still different.
  • Tuvalu: NB has "Vaiaku/Funafuti" but all other languages have only Funafuti. Fortunately, this has been fixed on Norwegian Wikipedia.
  • Kazakhstan: NB has "Astana" but all other languages have Nur Sultan. Fortunately, this has been fixed on Norwegian Wikipedia.
  • Kiribati: NB has "Bairiki (på Tarawa øy)" but all other languages have Tarawa. Wikipedia has switched to South Tarawa almost across the board, including in Norwegian, but the capital is written as "Tarawa (Jižní Tarawa)" - i.e. Tarawa (South Tarawa) - on Czech Wikipedia.

Looking at all of this, there are a number of incorrect or outdated capitals that clearly need to be fixed (Palestine, Sri Lanka, Tuvalu, Kazakhstan and Kiribati).

Assuming we fix them (I'll open a PR), we'll be left with the following inconsistencies:

  1. one where half of the languages have a different capital than the others: Palestine;
  2. one where the capital is completely different in one language only - i.e. "Melekeok" for Palau in FR instead of Ngerulmud;
  3. three where multiple capitals are listed in one language only: "Sri Jayewardanapura Kotte, Colombo" for Sri Lanka in NB, "Brussel, Strasbourg, Luxembourg" for European Union in NB, "Podgorica, Cetiña" for Montenegro in ES;
  4. one where the capital is written in a specific way in one language only - i.e. "Tarawa (South Tarawa)" for Kiribati in CS instead of South Tarawa.
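
The third kind of inconsistency (several capitals listed in one language only) could probably be spotted mechanically. Here is a minimal sketch, assuming the Capital field of each language is available as a plain string keyed by language code. The helper names `count_capitals` and `find_count_outliers` are hypothetical and not part of the deck's tooling, and the sample values are illustrative, based on the Montenegro case above:

```python
from collections import Counter


def count_capitals(capital_field: str) -> int:
    """Count the comma-separated names in a Capital field."""
    return len([part for part in capital_field.split(",") if part.strip()])


def find_count_outliers(capitals_by_lang: dict[str, str]) -> dict[str, str]:
    """Return the languages whose number of capitals differs from the most common number."""
    counts = {lang: count_capitals(value) for lang, value in capitals_by_lang.items()}
    most_common_count, _ = Counter(counts.values()).most_common(1)[0]
    return {
        lang: capitals_by_lang[lang]
        for lang, n in counts.items()
        if n != most_common_count
    }


# Illustrative data only (hypothetical mapping, not read from the deck).
montenegro = {
    "en": "Podgorica",
    "fr": "Podgorica",
    "es": "Podgorica, Cetiña",
    "nb": "Podgorica",
}
print(find_count_outliers(montenegro))  # {'es': 'Podgorica, Cetiña'}
```

This only compares how many names are listed, so it would not catch the second kind (a completely different capital in one language), since names legitimately differ between languages.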

So my question is: could we amend our guidelines to remove the 3rd and 4th kinds of inconsistencies?

For 3., the other capital candidates are mentioned in the Capital info field anyway, and for 4., South Tarawa is just the more "precise" capital. Any thoughts?


EDIT I've opened two PRs to update some of the capitals as per Wikipedia: #417 #418, and one issue to deal with the case of Palestine, since it's more complex: #419

EDIT Closed #418 as Wikipedia can't make up its mind about the capital of Kiribati.

axelboc added the "content" label Apr 11, 2021
axelboc added this to the v4.2 milestone Apr 11, 2021
Collaborator Author

axelboc commented Apr 17, 2021

I completely forgot that the Norwegian deck doesn't use Wikipedia as its first source, so its capitals are correct as per the current guidelines.

As discussed in #417, we should consider updating the Translation sources for Norwegian so that the site of the Ministry of Foreign Affairs of Norway, which was last updated in 2013, is no longer the main source. This will help remove some of the inconsistencies.

I'm no longer convinced that the guidelines need to be amended any further.

However, we should make sure that we follow the guidelines correctly by removing alternative names/spellings from the Capital field and moving them to the Capital info field:

  • Helsingfors (Helsinki)
  • Santiago (de Chile)
  • Ouagadougou (Wagadugu)
  • (Santa Fe de) Bogotá
  • Bairiki (på Tarawa øy)
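
Entries like these could be found with a simple pattern match. Below is a rough sketch, assuming that alternative names or qualifiers always appear in parentheses, as in the five examples above. The function name `split_capital_field` is hypothetical. It only separates the parenthesised text from the rest; deciding how to word the Capital info field remains an editorial choice:

```python
import re

# Matches one parenthesised group, e.g. "(Helsinki)" or "(Santa Fe de)".
PAREN = re.compile(r"\(([^)]*)\)")


def split_capital_field(value: str) -> tuple[str, list[str]]:
    """Split a Capital field into its main text and any parenthesised extras."""
    extras = [match.strip() for match in PAREN.findall(value)]
    # Remove the parenthesised groups and normalise the remaining whitespace.
    main = " ".join(PAREN.sub(" ", value).split())
    return main, extras


for field in [
    "Helsingfors (Helsinki)",
    "Santiago (de Chile)",
    "Ouagadougou (Wagadugu)",
    "(Santa Fe de) Bogotá",
    "Bairiki (på Tarawa øy)",
]:
    main, extras = split_capital_field(field)
    if extras:
        print(f"{field!r}: keep {main!r}, move {extras} to Capital info")
```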

Collaborator Author

axelboc commented Apr 17, 2021

> I'm no longer convinced that the guidelines need to be amended any further.

On second thought (sorry 😅)... Looking back at the cases of Palestine and Kiribati, especially, it's clear that our guidelines still have some limitations.

Perhaps we should reconsider our policy of following each localised Wikipedia (or translation sources) for capitals, and instead take the capitals from English Wikipedia and just translate them... like we do for countries and flags basically 😄

I can't find the original discussion that led us (me?) to choose this policy (#210 and #255 are as far as I could get), but now that we have so many translations, keeping each deck in sync with its corresponding localised Wikipedia is clearly impractical ... and it's only going to get worse the more languages we get!

If we decide to use English Wikipedia as the source of truth, and a capital keeps changing back-and-forth between two names there, then we only have to resolve this volatility in one place, by finding better sources and discussing the matter on the country's Talk page.

Of course, things such as spelling, alternative names, etc. would remain sourced from each localised Wikipedia (or translation sources) independently.

Making this change to the guidelines would resolve all capital inconsistencies across languages, period:

  • If a localised Wikipedia can prove English Wikipedia wrong with an official and universal source, then the correction should be made to English Wikipedia as well.
  • If there's no official and universal source anywhere, and the inconsistency revolves around the level of administrative division to consider as the capital, like for Kiribati, Palau, Sri Lanka, etc., I'd argue that following English Wikipedia is a valid choice.
  • If a localised Wikipedia lists more than one capital where English Wikipedia lists only one, like for the European Union, then the other capitals are most likely already mentioned in the Capital info field, so again, I'd argue that following English Wikipedia is a valid choice.
  • If a localised Wikipedia uses fewer or different words to qualify the capitals (e.g. no mention of de facto), then English Wikipedia is likely more accurate. If it's the other way around, then English Wikipedia is likely in need of a fix.
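
To make the proposal concrete, here is a hedged sketch of how the two sources could combine: English Wikipedia decides which cities count as the capital, and each language only supplies its own name for those cities. The data structures and the `localised_capital` function are hypothetical illustrations, not the deck's actual format:

```python
# Which cities are the capital: decided once, from English Wikipedia (assumed data).
ENGLISH_CAPITALS = {
    "European Union": ["Brussels"],
    "Montenegro": ["Podgorica"],
}

# How each city is written: still sourced per language (illustrative values).
LOCAL_NAMES = {
    "nb": {"Brussels": "Brussel", "Podgorica": "Podgorica"},
    "es": {"Brussels": "Bruselas", "Podgorica": "Podgorica"},
}


def localised_capital(country: str, lang: str) -> str:
    """Build the Capital field for a language from the English list of capitals."""
    names = LOCAL_NAMES[lang]
    return ", ".join(names[city] for city in ENGLISH_CAPITALS[country])


print(localised_capital("European Union", "nb"))  # Brussel
print(localised_capital("Montenegro", "es"))      # Podgorica
```

With this split, a language can never list a different number of capitals than English does; Strasbourg, Luxembourg or Cetinje would only appear in the Capital info field.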

Collaborator

aplaice commented Apr 17, 2021

from the other issue:

> and I can't actually find any case where it led to a factually worthwhile inconsistency...

In principle, conflicts about the actual capital could have been interesting, say due to differences in the definition of a capital in different languages, but I think you're completely right that in practice there haven't been any such cases!

Aside on Montenegro's capitals

For instance, it's vaguely interesting that Spanish Wikipedia had decided that both Cetinje and Podgorica were capitals of Montenegro (rather than one being an honorary capital and the other an actual one), but I don't think it's indicative of any deeper differences in how the Spanish language or Spanish people think of capitals or cities, but was more likely just an accident of editing. (The fact that the Serbo-Croatian Wiki describes Cetinje as the "throne capital" might be, but even there I'm not sure (e.g. Serbian Wikipedia just lists Podgorica as the capital). It might perhaps be worthwhile to leave a loophole for the capitals of countries, in that country's native language(s), in case people "on the ground" have very strong opinions about precise details, in such a situation. I don't think we have any such situations atm though, and we can worry about them when/if we do.)


I fully agree with your points about the different possible scenarios — the "real" capital (or the best guess of what the "real" capital is) should be the same irrespective of language.


> Of course, things such as spelling, alternative names, etc. would remain sourced from each localised Wikipedia (or translation sources) independently.

Yeah, definitely! There are many cases where another language has multiple names for an entity, but English only has one and vice-versa. (Hence, some care will still be needed with the Country/Capital infos and with choosing the correct name/spelling as the "main" version, but at least we won't have situations where different languages list different "entities" (or even different numbers of "entities") in the capital field...)


> I can't find the original discussion that led us (me?) to choose this policy (#210 and #255 are as far as I could get),

I'm not sure either. It seemed like a good idea at the time (I was totally in favour!), allowing us to be "language-neutral", but with hindsight it was a large amount of effort for effectively no gain, and with so many languages it's untenable, as you wrote.
