When the sync directory path contains Chinese characters, files are synced to an unexpected path #576

Closed
binsee opened this issue Oct 30, 2020 · 2 comments · Fixed by rime/librime#806

binsee commented Oct 30, 2020

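Weasel version: 0.14.3
OS version: Windows 10 x64

Problem description

sync_dir is set in installation.yaml so that the configuration is synced to a specified directory.
However, when the specified directory path contains Chinese (non-ASCII) characters, the sync actually goes to a garbled path rather than the path given in the configuration file.

Details

  • Configuration file: installation.yaml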
# encoding: utf-8

distribution_code_name: Weasel
distribution_name: "小狼毫"
distribution_version: 0.14.3
install_time: "Sat Oct 31 01:25:18 2020"
installation_id: "MyPC"
rime_version: 1.5.3
sync_dir: 'E:\Documents\坚果云\RimeSync'
  • Run the sync

  • Path actually synced to: E:\Documents\鍧氭灉浜慭RimeSync\MyPC

  • If the configuration file is saved in ANSI encoding instead, the sync goes to the expected path

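Analysis

Testing shows the following:

// UTF-8 encoding
E:\Documents\坚果云\RimeSync

// The same bytes displayed as ANSI
E:\Documents\鍧氭灉浜慭RimeSync
  • Rime's YAML files are UTF-8 encoded by default, so installation.yaml is also saved as UTF-8 after editing
  • librime reads the path from installation.yaml, but treats it as ANSI-encoded when building the actual path.
  • When a path string that contains Chinese characters is UTF-8 encoded and is used directly as if it were ANSI, the result is garbled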
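
The mismatch described in the analysis can be reproduced outside librime. The Python sketch below is not librime code, and the helper name `misread_utf8_as_ansi` is made up for illustration. It assumes the ANSI code page on the reporter's machine is CP936 (GBK), which the issue does not state but which is consistent with the garbled path shown. It encodes the configured path as UTF-8 and then decodes the same bytes as GBK.

```python
# Illustrative only: reproduce the garbling by misreading UTF-8 bytes as GBK.
# Assumption: the reporter's Windows ANSI code page is CP936 (GBK).


def misread_utf8_as_ansi(text: str, ansi_codec: str = "gbk") -> str:
    """Encode text as UTF-8, then decode the same bytes with an ANSI codec."""
    return text.encode("utf-8").decode(ansi_codec, errors="replace")


configured = r"E:\Documents\坚果云\RimeSync"
garbled = misread_utf8_as_ansi(configured)
print(garbled)
```

The three Chinese characters occupy nine bytes in UTF-8, while GBK consumes bytes in pairs. The ninth byte therefore pairs with the backslash that follows it, which would explain why the separator before RimeSync is missing from the garbled path.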

Temporary workarounds

  • Do not use Chinese characters in the sync directory path
  • Save installation.yaml in ANSI encoding
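
For the second workaround, re-saving the file can be scripted. This is a minimal sketch with hypothetical helper names, again assuming the ANSI code page is CP936 (GBK). It only applies to librime versions that predate the fix referenced in this thread, since the fix commit states that installation.yaml should be UTF-8 encoded afterwards.

```python
# Illustrative helpers (hypothetical names), not part of Rime or Weasel.
# Assumption: the target ANSI code page is CP936 (GBK).
from pathlib import Path


def utf8_to_ansi_bytes(data: bytes, ansi_codec: str = "gbk") -> bytes:
    """Decode UTF-8 bytes (with or without a BOM) and re-encode them."""
    return data.decode("utf-8-sig").encode(ansi_codec)


def resave_as_ansi(src: Path, dst: Path, ansi_codec: str = "gbk") -> None:
    """Re-save a UTF-8 file in the ANSI codec, working on raw bytes so that
    line endings are left untouched."""
    dst.write_bytes(utf8_to_ansi_bytes(src.read_bytes(), ansi_codec))


# In-memory demonstration on the sync_dir line from the report.
sample = "sync_dir: 'E:\\Documents\\坚果云\\RimeSync'\n".encode("utf-8")
converted = utf8_to_ansi_bytes(sample)
```

Working on bytes rather than text mode keeps the conversion independent of the platform's newline handling. A character that the target codec cannot represent raises `UnicodeEncodeError` instead of being silently replaced.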

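Additional notes

Weasel performs the sync by calling the librime API, so the faulty code is in librime and this issue should perhaps be moved there.
However, since I am not familiar with C++, I am unsure whether this is a build problem or something else, so I filed it under Weasel. Maintainers, please decide whether it should be moved to librime.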

lotem (Member) commented Nov 2, 2020

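I agree with the analysis and the temporary workarounds.
librime uses UTF-8 encoding throughout; conversion to the platform encoding should be handled by the frontend.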

Qeynos commented Dec 18, 2023

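In addition, extension dictionaries whose file names contain Chinese characters cannot be referenced either. Given that this is an input method used mainly for typing Chinese, that is somewhat odd.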

lotem added a commit to lotem/librime that referenced this issue Feb 3, 2024
Follow @fxliang 's PR, use `u8path` on Windows to convert UTF-8 string
to Windows native path.
Closes rime#804

Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: `installation.yaml` should be UTF-8 encoded.

Previously on Windows, the file could be written in the local encoding to
enable paths with non-ASCII characters. It should be updated to UTF-8
after this change.
graphemecluster pushed a commit to TypeDuck-HK/librime that referenced this issue Mar 18, 2024
refactor: convert path to native encoding on Windows

feat(rime_api): provide secure version of path getter functions `RimeApi::get_*_dir_s`.

Follow @fxliang 's PR, use `u8path` on Windows to convert UTF-8 string
to Windows native path.

Closes rime#804
Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: Most `string` filenames in APIs are changed to `path`;
`installation.yaml` should be UTF-8 encoded.

Previously on Windows, the file could be written in the local encoding to
enable paths with non-ASCII characters. It should be updated to UTF-8
after this change.

Details of the code refactor

Wrap `std::filesystem::path` in a thin wrapper class `rime::path` which calls `std::filesystem::u8path` in the constructor on Windows.

Operator `/=` and `/` are also overloaded to convert the right operand from UTF-8 string to native path.

Follow these rules to apply correct conversion between `string` and `rime::path`:

- construct `rime::path` with UTF-8 encoded string;
- get native string by `path::string`;
- to extract UTF-8 string from `path`, for example to find schema ID from file name, call `path::u8string`;
- avoid implicit conversion from string, which results in `std::filesystem::path` without performing UTF-8 to native conversion;
- explicitly construct `rime::path` from `std::filesystem::path` before append operation, to ensure the overloaded operator with string conversion is used.