Switch skycultures to the new format #3751
OMG, translators will hate us for that. Back to start for everything? Review all Google translations again? Any chance to see the old translations? |
The cultures in the external repo have some customized texts, so if we import them, the translations will have to change one way or another. One way to go would be to start with converting all the current cultures to the new format, and only then replace them with the ones in the external repo. But anyway, something must be done with the translations at some point—now or after the separate import, and this does imply a large review.
Yes, I expected this. The change is huge. |
Hello,
I think the old translations for object names (constellations etc.) should be more or less preserved, probably with some errors (Ruslan, can you confirm this?). But clearly the existing translations for the sky culture descriptions are lost. Most of the translations in the stellarium-skycultures repo were generated with Google Translate, and I still think auto-translation is the way to go for those long texts, but with better AI-based tools. Some tests I did showed that ChatGPT can perform remarkably well for many languages, much better than Google Translate (especially when passing a meaningful context in the prompt). For example, I don't think I could do a better job than ChatGPT in French.
Yes, the repo already contains documentation in the README.md. It's not enough, but it's a good start. |
The regions in new format (and in Mobile and Web editions) are different in comparison to Desktop edition (or old format) - I think we should use one universal list for regions (at least for SC) for all editions of planetarium. |
They don't seem to have been copied from the original sky cultures. E.g. in Anutan original:
and new:
The lack of the dieresis in the first name and failure to capitalize the second one compared to their old versions hint that they were translated independently. Even worse, there are simply wrong translations, e.g.:
becomes
Here in the new format the plant (vegetation) is translated with its second meaning (factory), and also is sloppy grammar-wise. |
This is why all these machine translations (which of course have no context) must be marked unreviewed and reviewed (again) by a human with fitting background knowledge. This is a huge effort. Of course, the unreviewed "candidates" can go into the releases as before, to be found by all users. Should we add a "You found a suspect translation? Go to [Transifex] to help!" button to make that even more visible? (Of course also a note in the 24.3/24.4/25.1 release notes, but who reads them :-) The user translation again needs review/approval, of course. |
I think it's better to improve the context passed to ChatGPT until everything is correct in the languages we know like Russian, German and French. Then use the same context for all languages to minimize the amount of errors. Note that when I created the new format I tried to re-use the existing translations as much as I could, so I am not sure why it diverged in your examples.. |
Major SCs may have "canonical" translations in use for decades in the major languages where relevant books appear. These should be preferred (with a note like "German translations following X.Y. (1976)"!) over self-made translation dabbles or AI tools. |
Immediate reactions/ thoughts:
|
Further comments on the format
Can we find a solution for these cases to use the image in the "illustration" folder directly in the description? This concerns the following SCs:
Should we define a sort of template or "standard" ("one to rule them all" will not really work, but maybe a guideline?) for the description?
|
I think in our context "Western" has always predated the Iron Curtain meaning by centuries. What is commonly understood by "western" is European scholarship from the Age of Enlightenment but rooted in European antiquity (traditionally executed in universities and Academies of Science from Lisbon to St. Petersburg), as opposed to e.g. Islamic, Chinese, Indian, and indigenous traditions in other continents which are, in western scholarship, usually dealt with in "ethnographic studies". Still, we have agreed to rename all Western* to Modern*. |
Yes, we should use the images from the illustrations/ subfolder directly in the description. There is nothing preventing this from a technical point of view. In general, in the new format I really encourage avoiding a section dedicated to each constellation outside the already existing ## Constellations section. The code then cross-matches the content with the content of the index.json file, so it's usually not even necessary to link to the image at all.
It's already like that. The template for the markdown file has a strict structure with mandatory sections. |
Yes.. In Stellarium Mobile we didn't switch because this work predated the renaming. I am a bit worried to do that now because in practice the "Modern" name seems to be annoying some users.. I have seen angry emails.. But I guess we will also need to switch.. Hopefully we won't receive too many bad reviews.. |
Everyone I know who uses localized software expects the translations to be good—at least made by people who speak both the source and the target languages. They definitely don't think of it as "reading something in a foreign language". Moreover, many users don't even read in foreign languages well enough (or at all) to be able to cross-check anything. In my view, using an unedited machine translation is just a mark of poor quality of the product (which unfortunately applies to lots of commercial software nowadays, even those products that used to have great localizations two decades ago). Anyway, I'm now going to switch to a bit more conservative approach for this PR and convert all "old" sky cultures to the new format, so that we could handle the switch to the new ones in a separate thread, with all the problems of the translations. |
To be more precise, "Modern" are those from the 20th century and later that obey IAU constellations and borders. These are our default and some variants ("single presentations" after Rey, S&T, Hlad, others?). What did we decide on European 17-19th century atlases? (Or are they just "Hevelius", "Bayer", "Bode (1782)", "Bode (1801)" etc.?) In this respect, we could still call our default (classic Stellarium) "Default" or even "Stellarium", pointing out the originality of Johan's figure set [which has been taken over successfully outside the project] and giving us all liberties about what to include, and the others "Modern-S&T", "Modern-Rey" etc. |
your opinion! In research, "western" has been used in recent decades by scholars west of the Iron Curtain (= western Europe + N. America) |
hmmmm... Thinking of software: I think, you are right, that's a bit different. we expect the translation to be good enough that we don't need to understand the technology before reading the text that explains it (which makes the text useless). |
So, is Western Physics much different from Physics researched in Beijing? |
In my childhood, we called it "modern physics"/"modern science" and not "western": that's what I am saying. If you want to politically frame a term (which was done in this time), you need to find different terms for things that have nothing to do with the negatively framed terms, like science. China has Confucianism in addition to modern physics. |
yes, I hope so, too... maybe point them to me in this case. In the 1990s we (east-germans) have undergone a linguistic re-education: suddenly, many terms were used differently and some terms were "forbidden" or meant sth. else ... as this influenced me rather deeply, I think a lot about the terms. I certainly do not want to 'always go back' but in contrast, I am embracing change. However, I think, in some cases the "newer" version does not really make sense. In case of the "western", I have the impression that it is both, a) too politically charged in whatever direction ('good' for one is 'bad' for others) and b) sometimes really confusing (because, e.g.. depending on the context "western" means different things: sometimes, I really have to think about the meaning of a sentence). |
Sure, you call that my opinion. But I feel I am not alone. The rest of the world still uses and understands the term "Western Science" without problems. Quick example: https://en.wikipedia.org/wiki/The_Beginnings_of_Western_Science This is fully non-political. Sorry, but maybe it was your childhood experience that was politicized by the powers around you then, when everything from the "West", even the European science tradition, had to be presented in a bad light or needed a new name in the GDR. But even the Soviet A-bomb is based on "Western" 20th century physics. (Not only thanks to Klaus Fuchs. The physics behind it was discovered in the European physics tradition of science, in North America, while in Germany a non-Einsteinian "German Physics" was tried and failed instead. There is probably just one unpolitical way nature behaves, and our scientific understanding (call it European, Western, Modern or what you want) seems to provide the best model, despite shortcomings). The political East/West separation is a post-1945 (no "Eastern Block" before that) thingy that we had all hoped to have overcome in 1991. Before that there was of course the Christian East/West divide which had a strong influence in traditions and beliefs, but royal courts were closely related from UK to Russia, which of course was also an imperialistic monarchy by undisputed Grace of God that tried its best to be European (Western). I cannot say whether "East" was then not rather understood as "oriental, Ottoman" etc. OK, we have gone largely off-topic, and I would stop here. Above, I had suggested possibly renaming our own default "Modern" SC into "Stellarium" (to give us all liberties on style and displayed objects), and use Modern-* for those IAU-constellation aware SCs where traces of Western-* naming may still be found. I did not suggest renaming anything back to Western-* because of your expected opposition, although almost everybody was OK with that name. |
The term "Oriental" also depends on context: sometimes it is China, sometimes it is West Asia. That is why there are terms like "Near East", "Middle East" and "Far East", which don't make sense... "East" and "west" were defined by Aristotle as directions (clear for more than 2000 years). The sense comes in when you define the vertex where the vector starts. ... I really have more important things to do. Let's just happily disagree ... we will never have a consensus here. |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
@10110111 should the description be translated in the GUI? |
What do you mean? It should look the same way it did with the old format, i.e. names in the list should be translated, description text too. |
Sorry, it was mistake on my side |
Some first thoughts: if edges_type=="iau", the actual edge definition could be read from a common file. Those edges are strictly defined from "sharp" RA/DEC of equinox B1875.0 (see data/constellations_spans.dat used for identification of object or mouse location, a later addition...), but were originally (by the founder team) given in decimal coordinates already precessed to J2000 (data/constellation_boundaries.dat). Re-converting those to at best arcsecond resolution J2000 coordinates may introduce errors. There are also SCs (@sushoff explained this to me, I hope I recall correctly), actually passed down to us in historical maps in which borders could be defined, of course most easily in coordinates at the respective map's equinox. Therefore I'd recommend to allow a choice of coordinates: equatorial/ecliptical (may help defining Lunar stations/mansions?) and an epoch entry, from which the actual vertex coordinates should be precessed/converted to "equatorial J2000" at loading time. |
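The "precess to equatorial J2000 at loading time" step suggested above could look roughly like the sketch below. It is only an illustration: it assumes the IAU 1976 precession angles in the general form given in Meeus, *Astronomical Algorithms* (ch. 21), takes the source epoch as a Julian date, and the function and constant names are hypothetical. It is not Stellarium's actual loader, which may use a different precession model.

```python
import math

J2000_JD = 2451545.0
B1875_JD = 2405889.25855  # Besselian epoch B1875.0 expressed as a Julian date


def precess_to_j2000(ra_deg, dec_deg, epoch_jd):
    """Precess mean equatorial coordinates from the equinox of epoch_jd to J2000.0.

    Hypothetical helper for illustration; angles in degrees.
    """
    T = (epoch_jd - J2000_JD) / 36525.0  # start epoch, Julian centuries from J2000.0
    t = (J2000_JD - epoch_jd) / 36525.0  # interval from the start epoch to J2000.0

    # Precession angles in arcseconds (IAU 1976, general form).
    lin = 2306.2181 + 1.39656 * T - 0.000139 * T * T
    zeta = lin * t + (0.30188 - 0.000344 * T) * t**2 + 0.017998 * t**3
    z = lin * t + (1.09468 + 0.000066 * T) * t**2 + 0.018203 * t**3
    theta = ((2004.3109 - 0.85330 * T - 0.000217 * T * T) * t
             - (0.42665 + 0.000217 * T) * t**2
             - 0.041833 * t**3)
    zeta, z, theta = (math.radians(a / 3600.0) for a in (zeta, z, theta))

    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    a = math.cos(dec) * math.sin(ra + zeta)
    b = (math.cos(theta) * math.cos(dec) * math.cos(ra + zeta)
         - math.sin(theta) * math.sin(dec))
    c = (math.sin(theta) * math.cos(dec) * math.cos(ra + zeta)
         + math.cos(theta) * math.sin(dec))

    ra_new = math.degrees(math.atan2(a, b) + z) % 360.0
    dec_new = math.degrees(math.asin(max(-1.0, min(1.0, c))))
    return ra_new, dec_new


# Example: a boundary vertex given at the B1875.0 equinox.
ra_j2000, dec_j2000 = precess_to_j2000(0.0, 0.0, B1875_JD)
```

Ecliptical input would need an extra conversion to equatorial coordinates (with the obliquity of the source epoch) before this step; that part is left out here.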
For the epoch of the boundaries I added |
Next question: When developing a SC, I like to take notes in comments, like "star names found on map 23", "stick figure from map 12, not 14", .... These need not be displayed and need therefore also not be transferred in any mobile app or packed in distributions. Can we add comments to the JSON which are then best stripped away during packing? (JSON usually does not support comments, but workarounds exist.) Same goes with the .md files. https://stackoverflow.com/questions/4823468/comments-in-markdown/20885980#20885980 may work, any other thoughts? |
For map entries you could add a ...
"comment": " Constellations for the Tibetan skyculture. Started as copy of default constellation lines. Only Zodiacal and a few northern constellations have been activated.",
"constellations": [
{
"id": "CON tibetan Lib",
"lines": [[77853, 76333, 74785, 72622, 73714, 76333]],
... If we agree on this style, I suppose the format document could be amended to fix this as part of the format, reserving "comment" as a keyword for comments. As for the comments in Markdown, note that these will be visible to the translators, because now the translation works per section, rather than per HTML tag. Maybe it would be simpler to just use the HTML-style comments, since the rendered document will be an HTML anyways (converted by md4c). |
Can we use "comment" everywhere once per node please? I see need at least per-constellation/asterism/star name. OK for HTML-style comments in MD, thanks. I hope the translators will not bother... |
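If "comment" becomes a reserved key that may appear once per node, stripping it during packaging could be as simple as the recursive filter sketched here. This is a minimal illustration, not part of the PR: the function name is made up, and the sample data just mirrors the Tibetan snippet quoted above.

```python
import json


def strip_comments(node, key="comment"):
    """Return a copy of a parsed JSON tree with every `key` entry removed."""
    if isinstance(node, dict):
        return {k: strip_comments(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_comments(v, key) for v in node]
    return node


source = json.loads("""
{
  "comment": "Constellations for the Tibetan skyculture.",
  "constellations": [
    {
      "id": "CON tibetan Lib",
      "comment": "stick figure from map 12, not 14",
      "lines": [[77853, 76333, 74785, 72622, 73714, 76333]]
    }
  ]
}
""")

packed = strip_comments(source)
print(json.dumps(packed, indent=2))
```

The same function would also cover a separate translator-comment key by calling it a second time with a different `key` argument.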
The "comment" key in JSON objects sounds good to me. We might also add comments for translators ("tcomment"?), but we can see that later. For .md files I'm more reluctant, because it's just going to fill the translatable text with hidden content, confusing the translators. Most translation services also charge per word, which is not good in the case of extra text. Note that some sky cultures have a doc/ directory used to contain extra information for authors. That might be enough? |
OK, paying for translating hidden comments is of course nonsense... Maybe we could develop an annotated source format in the doc dir which is then stripped by some simple tool. (just use sed to delete all lines starting with %# or so, or some tool that strips HTML comments). As content creator, I know how important comments are months or even years later to trace your steps and decisions. The doc dirs then need not be delivered into the installable apps. I have no daily need to edit .md, so I don't know a good editor that shows source+result in the way that we need. I know not all are equal. Any recommendations? This is of course also important to future SC contributors. Source references for star names could be added as optional dictionary entries in the star name set. I assume the format was created in this way so it will allow us to easily extend the set beyond the "english" entry. |
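The "simple tool" idea could be sketched as follows: delete HTML-style comments and any line starting with a marker such as `%#` from an annotated source before it is handed to translators or packed. The marker and the function name are only the placeholders from this discussion, nothing that is agreed. Note that this naive version would also strip matches inside code spans.

```python
import re

# Non-greedy, so several comments in one file are removed separately;
# DOTALL lets a comment span multiple lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_notes(text, line_marker="%#"):
    """Remove HTML comments and marker-prefixed note lines from Markdown text."""
    text = HTML_COMMENT.sub("", text)
    kept = [ln for ln in text.splitlines()
            if not ln.lstrip().startswith(line_marker)]
    return "\n".join(kept)


annotated = (
    "## Constellations\n"
    "%# stick figure from map 12, not 14\n"
    "Text <!-- star names found on map 23 -->here.\n"
)
print(strip_notes(annotated))
```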
Maybe we could use a fixed format for the comments, e.g.
This then will make it possible to easily strip them using a regex like |
OK for me, even though I'd prefer to avoid comments at all in the .md |
In some source document comments and sidenotes will be absolutely needed if we receive one-shot (unmaintained) contributions. These can be stripped off from an .md as Ruslan suggested, or we must invent yet another "contributor source format". Not allowing HTML tags in comments should be possible, when the commented-away text is at best marked up in MD. |
1. Concerning the boundary definitions, I see two different challenges:

a) @gzotti yes, there are historical maps that have coordinates and boundaries, but the boundaries do not follow coordinates: historical maps have "cloudy" boundaries, for example the Bode map. I haven't used something like that yet, but I know some historians who may want to use it. They cannot use stick figures for it, as there are not necessarily stars, and for the artwork they will use the artwork of the historical map. Thus, we need to offer another functionality that allows cloudy boundaries. On the other hand, I really don't know how frequently this occurs in history (it will occur in Doris's material, I guess), so I am unable to judge the urgency of this function.

b) For my own work in ancient history, where no boundaries have ever been defined, I used the mathematics of a "convex hull" of all stars below a headline (= constellation name) in a star catalogue. This way, I created "minimal polygons" (not necessarily boundaries, as they overlap and leave gaps). These polygons turned out to be a highly useful tool to visualize some qualities of historical constellations (e.g. the fact that they occasionally overlap), and to indicate rough areas when historical information is missing to accurately paint something and stick figures would be mere fantasy, etc. (My Chinese map is used by the IAU; the Greek and Babylonian maps were presented at the annual meeting of the German Astronomical Society 2015.) It can be considered a "tool for digital humanities", although the DH doesn't know about it yet. ;) Hence, I told Youla to automatically compute the convex hull in the Sky Culture Maker, so that the user doesn't have to care about it, but we can use it later for historical research (and we can output all stars within the constellation area from any given modern star catalogue, not only those stars that are mentioned by the historical author, which is important for the encyclopaedia).
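For readers unfamiliar with the "minimal polygon" idea, here is a small sketch of a convex hull computation (Andrew's monotone chain). It is not the Sky Culture Maker's code. It assumes the star positions have already been projected onto a plane, for instance with a gnomonic projection centred on the constellation, so that RA wrap-around and spherical geometry are out of scope here.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Projected positions of the stars listed under one constellation name
# (made-up numbers; the interior star does not appear in the hull).
stars_xy = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(stars_xy)
```

Testing which catalogue stars fall inside the polygon would be a separate point-in-polygon step on the same projected coordinates.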
Q: Are these convex-hull polygons boundaries? In a way yes, but they are artificially constructed, a modern tool, and not in the historical data... hm... So, this will probably add yet another object to the JSON file.

c) The Arabic "lunar stations" are currently defined as rectangles. This is a brilliant idea and works perfectly with the current definition [screenshot]. However, they are currently a standalone SC, but it would be better to have them (or something like this) in the Arabic SCs as "boundaries" (this is not a technical problem, only a historical one: we just have to do it and therefore make sure that as-Sufi and others also use them in this way). WORKS, perfectly. <3

In reply to a user request, I aim to define the sections of the Seleucid Babylonian zodiac (also a coordinate system) in the same way. This may then also work for other zodiac divisions (see Bode map above) and other coordinate systems.

Extension: In historical China, there was the concept of the "Lunar Mansions" (LM) that are, in fact, a coordinate system (= RA slices) and (for the confusion of the class enemy) have nothing to do with the moon. With Sinologists and Chinese historians of astronomy (and our contributors), we all agree that the LM are something like boundaries. The map in our "Chinese Mediaeval" SC actually displays them drawn, and it's one of the earliest preserved maps of humanity, i.e. all other maps also have it (the earliest, the Dunhuang, doesn't; never mind). In the map (see screenshot where I try to catch both) they are vertical lines, so we defined them as vertical (RA) lines in Stellarium.

Challenge: the LM are historically defined by specific stars, and as you know the stars change their RAs over time (i.e. in principle we would have to change the LM line when the star shifts). So, what we did was use the RA of the map's creation date.
However, when an astrophysicist in search of a historical nova uses this map for a different time, say ~200 years later because those astronomers used the same constellations, our hard-coded RA boundaries won't match any more: they will be at the RA of the earlier date, and then the RA slices do not meet in the current (equatorial) pole (which doesn't make sense when considering the northern circumpolar constellations and their LM). Thus, we would need the option to define boundaries attached to the stars (like stick figures).

That means: imagine we have a Sky Culture Maker ;) then the user would click a button for "create boundaries" and choose between the options (a) orthogonal coordinates + epoch (which works for IAU and Arabic), (b) cloudy line (like a constellation artwork drawing), or (c) lines anchored at stars.

2. Making of Sky Cultures: I think I would like to offer a solution in a different direction. I would separate the problems of (a) what the content contributor needs (as @gzotti remarked, "some years later, I need my comments") and (b) what the user needs. I would store in JSON what needs to be displayed (b), and I would store in a sort of website-like documentation what the contributor needs (a). Brave idea: our beautiful window which currently just displays "description.lang.md": can we provide it with another register (a 2nd register, below the "sky - sso - dso ..." one) so that the first register displays the information for the user and the second register displays the info for some sophisticated users/colleagues and the author? Youla's new "Sky Culture Maker" could output two files, "description.lang.md" and "technical_notes.lang.md", locally for the user while working on the new SC, and then Ruslan's converter will translate/merge them into the JSON format. This way, Youla's browser tool may serve as a GUI for Ruslan's app.

3. HTML is good, I think ... I'm not sure what people will find more common/more useful |
yet another thought - don't know if I said this already:
|
adding to my thought about the name labels: do we allow all unicode characters?
=> please let's define exactly what we want! As said above, ideally, we would allow Unicode (with all diacritics). |
yet another question (without recommendation, only a question): Are we sure we want to stick to the artwork definition with not more and not less than 3 points as anchors? I remember that @gzotti and I more than one time spoke about that issue because it is not really convenient. If I remember correctly, he sometimes said that once reworking the SCs, we should offer a flexible number of anchor points. However, this would mean that Stellarium also has to do the image processing of distorted figures. Hence, as much as I in general agree that it would be more convenient for the person who defines the SC: the end user would probably not care and it may (or not) increase the amount of computations. Here is another idea: As we are currently developing a "Sky Culture Maker", probably the better option would be to do the image processing with that (I suspect Youla will then need your help with software development because I myself don't know/have never tried to write an image processing software)? |
It's perfectly possible to skip a field if it's not known. For the name we already have a "pronounce" field supposed to contain the english transliteration, or let's say an ascii representation of the name. See for example: For the alternative names, it's something we might want to add even though it's already almost supported for other common names when we define a list of common names. We might need to accept a
I think it's already all in there
|
Yes I think we should allow all unicode, but some implementations might have issues displaying them. For example Stellarium web doesn't support most unicode text right now.. |
Is this just a problem with a font not including all required glyphs or a general JS issue? On the 3-point match: I think I asked Fabien for something better around 2011 :-) Probably we could work out a workflow involving a much simpler star catalog like BSC in a free GIS (QGIS) that allows georeferencing raster images with several match points (up to a rubberbanding solution). The raster could then be re-projected to (most probably) stereographic centered on its center of gravity (however you would define that automatically) and exported to a new raster where the 3-star match works better. Processing figures in this way is not difficult, just time consuming. Add this to cleaning up the copperplate (or other artwork) scans first... I started it once (2012?) in ArcGIS with some 20th-century atlas scans which however I must not republish before 2033 :-( |
great
ok, fine... (Head-Hair lunar mansion: "xiu" means "LM" ;) ). If you think, a fourth field would be empty in most of the SCs, it might be only a Babylonian problem (because due to the >2000 years, we have two languages, Akkadian and Sumerian), so I will have to find an individual solution for this
yes, I just wanted to add that however we change the format/ field-names above, we better do the same here :-) |
|
re BSC as "simpler". I just used BSC with its 9000 or so stars as easily preprocessable XY table for scaled star icons in a GIS to have something to place drawings against. I did not go for combining naked-eye magnitudes of close binaries in the GIS. (But did just that for my PostScript star maps, manually...). I just said we could develop a (Q)GIS based recommended workflow to adjust too far distorted drawings (artwork done in a highly distorting projection) into something that then can be linked successfully with 3 stars. It is just expected to be a tedious process for which I never had the time. Likewise I did not mean to develop or include any image cleaning editor. We digress here, of course this is another preprocessing step we expect contributors to do. |
agree |
This set of commits switches Stellarium to the new format of sky cultures used in the stellarium-skycultures repo.

The old format is no longer supported, but a tool is provided (`util/skyculture-converter`) that helps convert an old culture to the new one (with limited support for conversion of the description, mostly retaining HTML and only changing the heading structure to more or less follow the spec of the new format).

The sky cultures from the sky cultures repo are imported using a script, `skycultures/update-skycultures.py`.

Among the structural changes to this repo are:

- `skycultures/common_dso_names.fab` and `skycultures/common_star_names.fab` now contain the common names that used to reside in the `modern_iau` culture.
- `po/stellarium-skycultures` now keeps translations of culture-specific names, while the common names are translated in `po/stellarium-sky`.
- […] `po/stellarium-skycultures-descriptions`.
- […] `.po` entry per section.
- […] `modern` culture that I converted to the new format and pushed into that repo, for compatibility with the Stellarium default.