bug causes error uploading to huggingface, unicode issue on windows. #450

Merged
merged 1 commit into from
Nov 22, 2024

Conversation

resolver101757
Contributor

@resolver101757 resolver101757 commented Sep 21, 2024

What this does

This PR fixes a bug in LeRobot that causes an error when uploading a dataset to the Hugging Face Hub on Windows. On Windows, Python's default text encoding is often cp1252 (depending on the locale), which cannot represent certain Unicode characters, such as emojis. The dataset card contains an emoji, so writing the card to disk fails with a UnicodeEncodeError.
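The encoding limitation described above can be reproduced on any platform with a few lines of Python. This is only an illustrative sketch: `can_encode` is a hypothetical helper, not part of LeRobot or `huggingface_hub`, and the code point is the one reported in the traceback.

```python
# The character reported in the traceback (U+1F917, the hugging-face emoji).
EMOJI = "\U0001f917"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` without error."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # cp1252 is a single-byte code page and has no slot for the emoji.
    print("cp1252:", can_encode(EMOJI, "cp1252"))
    # UTF-8 can represent every Unicode code point.
    print("utf-8: ", can_encode(EMOJI, "utf-8"))
```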

Error trace:

Traceback (most recent call last):
  File "D:\dev\lerobot\lerobot\scripts\control_robot.py", line 885, in <module>
    record(robot, **kwargs)
  File "D:\dev\lerobot\lerobot\scripts\control_robot.py", line 681, in record
    push_dataset_card_to_hub(repo_id, revision="main", tags=tags)
  File "D:\dev\lerobot\lerobot\scripts\push_dataset_to_hub.py", line 124, in push_dataset_card_to_hub
    card.push_to_hub(repo_id=repo_id, repo_type="dataset", revision=revision)
  File "C:\Users\tblocal.conda\envs\lerobot\lib\site-packages\huggingface_hub\repocard.py", line 277, in push_to_hub
    tmp_path.write_text(str(self))
  File "C:\Users\tblocal.conda\envs\lerobot\lib\pathlib.py", line 1155, in write_text
    return f.write(data)
  File "C:\Users\tblocal.conda\envs\lerobot\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f917' in position 102: character maps to <undefined>

This fix removes the 🤗 emoji from the dataset card, which prevents the UnicodeEncodeError.
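For context, the traceback shows the failure occurring in a `write_text` call that does not pass an encoding. A different way to avoid this class of error, which is not what this PR does, is to pass `encoding="utf-8"` explicitly when writing text. The sketch below only illustrates that idea with the standard library; `write_card` and `roundtrip` are hypothetical names and the file name is an assumption.

```python
import tempfile
from pathlib import Path


def write_card(path: Path, content: str) -> None:
    """Write text with an explicit encoding instead of the platform default."""
    path.write_text(content, encoding="utf-8")


def roundtrip(content: str) -> str:
    """Write `content` to a temporary file and read it back as UTF-8."""
    with tempfile.TemporaryDirectory() as tmp:
        card_path = Path(tmp) / "README.md"  # hypothetical card file name
        write_card(card_path, content)
        return card_path.read_text(encoding="utf-8")


if __name__ == "__main__":
    text = "Dataset card with an emoji: \U0001f917"
    # The emoji survives the write/read cycle regardless of the OS locale.
    print(roundtrip(text) == text)
```

Removing the emoji, as this PR does, has the advantage of needing no change outside LeRobot, while an explicit encoding would have to be applied wherever the card is written.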

How it was tested

  • The script was run using the following command:

python lerobot/scripts/control_robot.py record --robot-path lerobot/configs/robot/koch_original.yaml --fps 30 --root data --repo-id RESOLVER101757/koch_test_2 --tags tutorial --warmup-time-s 5 --episode-time-s 30 --reset-time-s 30 --num-episodes 1

  • The run proceeded without encountering the UnicodeEncodeError, confirming that the issue is resolved.

How to checkout & try? (for the reviewer)

  1. Pull this branch.
  2. Run the record command:

python lerobot/scripts/control_robot.py record --robot-path lerobot/configs/robot/koch_original.yaml --fps 30 --root data --repo-id RESOLVER101757/koch_test_2 --tags tutorial --warmup-time-s 5 --episode-time-s 30 --reset-time-s 30 --num-episodes 1

  3. Verify that the command runs without the UnicodeEncodeError and that the dataset uploads successfully.
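A reviewer without a Windows machine could approximate the check by scanning the card text for characters that cp1252 cannot encode. The following is a minimal sketch under that assumption; `find_unencodable` is a hypothetical helper, and the sample strings stand in for the real card content, which is not reproduced here.

```python
def find_unencodable(text: str, encoding: str = "cp1252") -> list[tuple[int, str]]:
    """Return (index, character) pairs that `encoding` cannot represent."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char))
    return problems


if __name__ == "__main__":
    # Placeholder strings; substitute the actual card text when checking.
    before = "Created with \U0001f917 LeRobot"
    after = "Created with LeRobot"
    print(find_unencodable(before))  # non-empty: the emoji is reported
    print(find_unencodable(after))   # empty: safe to write as cp1252
```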

@resolver101757 resolver101757 changed the title from "bug causes error uploading to huggingface bug, unicode issue on windows." to "bug causes error uploading to huggingface, unicode issue on windows." on Sep 21, 2024
Collaborator

@Cadene Cadene left a comment


Thanks!

@Cadene Cadene merged commit 20f4667 into huggingface:main Nov 22, 2024
Collaborator

Cadene commented Nov 22, 2024

@resolver101757 If you find any other incompatibility issue, please let me know. I will be more reactive this time!!!

villekuosmanen added a commit to villekuosmanen/lerobot that referenced this pull request Dec 30, 2024
* feat: enable to use multiple rgb encoders per camera in diffusion policy (huggingface#484)

Co-authored-by: Alexander Soare <[email protected]>

* Fix config file (huggingface#495)

* fix: broken images and a few minor typos in README (huggingface#499)

Signed-off-by: ivelin <[email protected]>

* Add support for Windows (huggingface#494)

* bug causes error uploading to huggingface, unicode issue on windows. (huggingface#450)

* Add distinction between two unallowed cases in name check "eval_" (huggingface#489)

* Rename deprecated argument (temporal_ensemble_momentum) (huggingface#490)

* Dataset v2.0 (huggingface#461)

Co-authored-by: Remi <[email protected]>

* Refactor OpenX (huggingface#505)

* Fix missing local_files_only in record/replay (huggingface#540)

Co-authored-by: Simon Alibert <[email protected]>

* Control simulated robot with real leader (huggingface#514)

Co-authored-by: Remi <[email protected]>

* Update 7_get_started_with_real_robot.md (huggingface#559)

* LerobotDataset pushable to HF from any folder (huggingface#563)

* Fix example 6 (huggingface#572)

* fixing typo from 'teloperation' to 'teleoperation' (huggingface#566)

* [vizualizer] for LeRobodDataset V2 (huggingface#576)

* Fix broken `create_lerobot_dataset_card`  (huggingface#590)

* feat(act): support training end of episode token to ACT model

* changes

* feat(arx): add arx arm (#2)

* feat(arx): support arx arm

* changes

* changes

* changes

* changes

* pass pipes explicitly

* changes

* us ndarray over a pipe

* changes

* changes

* replay basically works

* patch arx sdk

* changes

* support cameras in arx5

* rename to arx5

* kind of works

* changes

* changes

* changes

* various changes

* changes

* revert a few changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* remove TODO

* allow multiple tasks

---------

Signed-off-by: ivelin <[email protected]>
Co-authored-by: Hirokazu Ishida <[email protected]>
Co-authored-by: Alexander Soare <[email protected]>
Co-authored-by: Arsen Ohanyan <[email protected]>
Co-authored-by: Ivelin Ivanov <[email protected]>
Co-authored-by: Daniel Ritchie <[email protected]>
Co-authored-by: resolver101757 <[email protected]>
Co-authored-by: Jannik Grothusen <[email protected]>
Co-authored-by: KasparSLT <[email protected]>
Co-authored-by: Simon Alibert <[email protected]>
Co-authored-by: Remi <[email protected]>
Co-authored-by: Michel Aractingi <[email protected]>
Co-authored-by: Simon Alibert <[email protected]>
Co-authored-by: berjaoui <[email protected]>
Co-authored-by: Claudio Coppola <[email protected]>
Co-authored-by: s1lent4gnt <[email protected]>
Co-authored-by: Mishig <[email protected]>
Co-authored-by: Eugene Mironov <[email protected]>
villekuosmanen added a commit to villekuosmanen/lerobot that referenced this pull request Jan 10, 2025
* feat: enable to use multiple rgb encoders per camera in diffusion policy (huggingface#484)

Co-authored-by: Alexander Soare <[email protected]>

* Fix config file (huggingface#495)

* fix: broken images and a few minor typos in README (huggingface#499)

Signed-off-by: ivelin <[email protected]>

* Add support for Windows (huggingface#494)

* bug causes error uploading to huggingface, unicode issue on windows. (huggingface#450)

* Add distinction between two unallowed cases in name check "eval_" (huggingface#489)

* Rename deprecated argument (temporal_ensemble_momentum) (huggingface#490)

* Dataset v2.0 (huggingface#461)

Co-authored-by: Remi <[email protected]>

* Refactor OpenX (huggingface#505)

* Fix missing local_files_only in record/replay (huggingface#540)

Co-authored-by: Simon Alibert <[email protected]>

* Control simulated robot with real leader (huggingface#514)

Co-authored-by: Remi <[email protected]>

* Update 7_get_started_with_real_robot.md (huggingface#559)

* LerobotDataset pushable to HF from any folder (huggingface#563)

* Fix example 6 (huggingface#572)

* fixing typo from 'teloperation' to 'teleoperation' (huggingface#566)

* [vizualizer] for LeRobodDataset V2 (huggingface#576)

* Fix broken `create_lerobot_dataset_card`  (huggingface#590)

* Update README.md (huggingface#612)

* Fix Quality workflow (huggingface#622)

* fix(docs): typos in benchmark readme.md (huggingface#614)

Co-authored-by: Simon Alibert <[email protected]>

* fix(visualise): use correct language description for each episode id (huggingface#604)

Co-authored-by: Simon Alibert <[email protected]>

* typo fix: batch_convert_dataset_v1_to_v2.py (huggingface#615)

Co-authored-by: Simon Alibert <[email protected]>

* [viz] Fixes & updates to html visualizer (huggingface#617)

* fixes to SO-100 readme (huggingface#600)

Co-authored-by: Philip Fung <no@one>
Co-authored-by: Simon Alibert <[email protected]>

---------

Signed-off-by: ivelin <[email protected]>
Co-authored-by: Hirokazu Ishida <[email protected]>
Co-authored-by: Alexander Soare <[email protected]>
Co-authored-by: Arsen Ohanyan <[email protected]>
Co-authored-by: Ivelin Ivanov <[email protected]>
Co-authored-by: Daniel Ritchie <[email protected]>
Co-authored-by: resolver101757 <[email protected]>
Co-authored-by: Jannik Grothusen <[email protected]>
Co-authored-by: KasparSLT <[email protected]>
Co-authored-by: Simon Alibert <[email protected]>
Co-authored-by: Remi <[email protected]>
Co-authored-by: Michel Aractingi <[email protected]>
Co-authored-by: Simon Alibert <[email protected]>
Co-authored-by: berjaoui <[email protected]>
Co-authored-by: Claudio Coppola <[email protected]>
Co-authored-by: s1lent4gnt <[email protected]>
Co-authored-by: Mishig <[email protected]>
Co-authored-by: Eugene Mironov <[email protected]>
Co-authored-by: CharlesCNorton <[email protected]>
Co-authored-by: Philip Fung <[email protected]>
Co-authored-by: Philip Fung <no@one>
2 participants