Japanese shinjitai (日本新字體) #494

Open
wants to merge 14 commits into
base: master


danny0838
Contributor

@danny0838 danny0838 commented Jul 10, 2020

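  • The t2jp scheme follows t2s in covering both "OpenCC standard forms to Japanese shinjitai" and "Japanese kyūjitai to Japanese shinjitai". Besides the OpenCC-standard-to-shinjitai conversion, it therefore also adds the kyūjitai-to-shinjitai mappings from the Jōyō Kanji Table (《常用漢字表》) and related tables.
  • Extended conversions are moved to JPVariantsEx.txt. These comprise the simplified conventional forms (簡易慣用字體) in the Hyōgai Kanji Jitai Table (《表外漢字字體表》) that are not designated as preferred (the ones designated as preferred go directly into JPVariants.txt), plus the extended shinjitai (擴張新字體). Extended shinjitai that do not conflict with the Japanese standard are converted directly; those that conflict are not converted by default and appear only as the second candidate. The default scheme t2jp does not include JPVariantsEx.txt, and a new t2jpx scheme is added that does include the extended conversions. jp2t includes the reverse conversion of JPVariantsEx.txt.
  • The extended shinjitai list is mostly taken from 新增日本新字體。 #371 ("Add Japanese shinjitai"), with a few extra characters added.
  • 「龝」 is a variant of 「秋」 rather than an OpenCC standard character. However, t2jpx also covers kyūjitai-to-shinjitai conversion and the source text is not necessarily in strict OpenCC standard forms, so the entry is kept.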
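The layering described in the list above can be sketched roughly as follows. This is a minimal illustration, not OpenCC's implementation: it converts character by character instead of using OpenCC's segmentation, the dictionary contents are tiny hypothetical samples rather than the files in this PR, and the helper names (`parse_dict`, `build_reverse`, `convert`) are made up for the example. The placeholder entry `X` with candidates `X Y` stands in for an extended shinjitai that conflicts with the Japanese standard, so its first candidate is the unchanged character.

```python
# Hypothetical miniature dictionaries in OpenCC's plain-text layout:
# "source<TAB>candidate1 candidate2 ...". Not the PR's actual file contents.
JP_VARIANTS = "曾\t曽\n瘦\t痩\n麵\t麺\n"
# "X" and "Y" are placeholders for a conflicting extended shinjitai:
# the first candidate keeps the character, the second is the extended form.
JP_VARIANTS_EX = "醬\t醤\n鹼\t鹸\nX\tX Y\n"


def parse_dict(text):
    """Parse 'key<TAB>cand1 cand2' lines into {key: [cand1, cand2, ...]}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, values = line.split("\t")
        table[key] = values.split(" ")
    return table


def build_reverse(table):
    """Invert a table: every candidate that differs from its key maps back to the key."""
    reverse = {}
    for key, candidates in table.items():
        for cand in candidates:
            if cand != key:
                reverse.setdefault(cand, []).append(key)
    return reverse


def convert(text, *tables):
    """Replace each character with the first candidate found in the merged tables."""
    merged = {}
    for table in tables:
        merged.update(table)
    return "".join(merged.get(ch, [ch])[0] for ch in text)


DEFAULT = parse_dict(JP_VARIANTS)
EXTENDED = parse_dict(JP_VARIANTS_EX)


def t2jp(text):
    # Default scheme: JPVariants only.
    return convert(text, DEFAULT)


def t2jpx(text):
    # Extended scheme: JPVariants plus JPVariantsEx.
    return convert(text, DEFAULT, EXTENDED)


def jp2t(text):
    # Reverse scheme: undoes both the default and the extended mappings.
    return convert(text, build_reverse(DEFAULT), build_reverse(EXTENDED))
```

With these sample tables, `t2jp` leaves 「醬」 untouched while `t2jpx` converts it, and `jp2t` maps both kinds of output back. The placeholder `X` is unchanged even under `t2jpx`, because only the first candidate is applied automatically.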

@danny0838 danny0838 force-pushed the 日本新字體 branch 18 times, most recently from b593869 to 8cc894b Compare July 12, 2020 14:05
@BYVoid
Owner

BYVoid commented Jul 13, 2020

What are "simplified conventional forms that are not designated as preferred" (「非簡慣優先的簡慣字體」)?

@danny0838
Contributor Author

danny0838 commented Jul 13, 2020

What are "simplified conventional forms that are not designated as preferred" (「非簡慣優先的簡慣字體」)?

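Japan's Hyōgai Kanji Jitai Table (《表外漢字字體表》) of 2000 lists the standard printed forms (印刷標準字體), and for some characters it also gives a simplified conventional form (簡易慣用字體, abbreviated 簡慣字體).

For most characters the standard printed form is the standard and the simplified conventional form is an acceptable variant. The three characters 「曽」「痩」「麺」 are exceptions: for them the simplified conventional form takes priority, and the revised Jōyō Kanji Table (《改定常用漢字表》) of 2010 also added these three as standard characters.

"Simplified conventional forms that are not designated as preferred" therefore means the simplified conventional forms of characters for which that form does not take priority, for example 「醤」 and 「鹸」. Because they are only acceptable variants and not standard forms, the default scheme does not convert to them. The extended scheme t2jpx, whose logic is "use shinjitai and analogically simplified forms as much as possible", does convert to them.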
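As a rough sketch of the rule explained above, the snippet below routes Hyōgai Kanji Jitai Table entries to one of the two dictionary files depending on whether the simplified conventional form takes priority. The `HyogaiEntry` type, the `kanyo_priority` flag, and the `route` function are hypothetical names for illustration; the five entries are the ones named in this comment, paired with their standard printed forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyogaiEntry:
    printed_standard: str          # standard printed form (印刷標準字體)
    simplified_conventional: str   # simplified conventional form (簡易慣用字體)
    kanyo_priority: bool           # True if the simplified form takes priority


# Only the characters mentioned in the discussion; not a full table.
ENTRIES = [
    HyogaiEntry("曾", "曽", True),
    HyogaiEntry("瘦", "痩", True),
    HyogaiEntry("麵", "麺", True),
    HyogaiEntry("醬", "醤", False),
    HyogaiEntry("鹼", "鹸", False),
]


def route(entries):
    """Split entries into lines for JPVariants.txt and JPVariantsEx.txt."""
    default_lines, extended_lines = [], []
    for entry in entries:
        line = f"{entry.printed_standard}\t{entry.simplified_conventional}"
        if entry.kanyo_priority:
            default_lines.append(line)   # converted by t2jp and t2jpx
        else:
            extended_lines.append(line)  # converted by t2jpx only
    return default_lines, extended_lines


jp_variants, jp_variants_ex = route(ENTRIES)
```

Here `jp_variants` holds the three priority forms and `jp_variants_ex` holds the entries for 「醤」 and 「鹸」.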

@danny0838 danny0838 force-pushed the 日本新字體 branch 2 times, most recently from 3eff2be to 9b0ec92 Compare July 16, 2020 12:28
danny0838 and others added 6 commits July 21, 2020 21:14
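- The Hyōgai Kanji Jitai Table (《表外漢字字體表》) treats the standard printed forms as primary and the simplified conventional forms as acceptable, so their forced conversion is dropped and the entries move to JPVariantsEx.txt. (Exceptions: 「曽」「痩」「麺」, which are explicitly designated as preferred and are included in the revised Jōyō Kanji Table (《改定常用漢字表》).)
- The default scheme t2jp does not include JPVariantsEx.txt; a t2jpx scheme that includes the extended conversions is added. jp2t includes reverting the extended conversions.
- 「龝」 is a variant of 「秋」 and is kept for complete kyūjitai-to-shinjitai support.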
@maxmellen

@BYVoid Any chance of seeing something like this merged?

I understand that you don't think non-BPM 擴張新字體 should be part of the t2jp preset, but I think having this separate t2jpx preset is a great way to separate "converting to the Japanese Standard" and "using Japanese shorthands as much as possible".
Either way, I would love to see a tool that lets me use 日本擴張新字體 as an alternate simplification scheme to 大陸簡化字.

@danny0838 Have you been using a fork in the meantime?

@ayaka14732
Collaborator

I am now developing StarCC, the next generation of OpenCC.

@danny0838 Could you make a PR there? We can work together on this project.

@danny0838
Contributor Author

@ayaka14732 We are overloaded and probably won't be able to handle the cross-project compatibility shortly. You can port them from our project sts-lib, though.


5 participants