-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dist/tools/usb_serial: Add tool for listing and filtering TTY interfaces #17737
Conversation
Nice, I like. This depends on pyudev (which is not common, as found by my survey with N=1). How do other scripts we provide deal with that? (It is in Debian as python3-pyudev, and has been for ages, in case that matters). It provides a bit less information than list-ttys.sh when it comes to the model; maybe that can be enhanced? For comparison:
vs.
where "ARM" and "BBC micro:bit CMSIS-DAP" come from /sys/bus/usb/devices/1-1/manufacturer and product, respectively. Seeing the "micro:bit" part would be good because that would pave the way for setting them per board, and making To avoid having too many tools I'd suggest we ensure this has feature parity with find-tty.sh and list-ttys.sh, and to deprecate them (ideally already when merging this). |
I added I also no preferred It might be sensible to check for a bunch of boards if there are more attributes available via udev that contain insightful information. I guess I found an excuse to undust a bunch of boards for research :) |
I also no preferred `ID_MODEL` over `ID_MODEL_FROM_DATABASE` (and same
for vendor) as a wild guess that this might spill out the `micro:bit`.
I will test later at home, if that guess was correct.
It is; now I also see "ARM" as vendor string and
"_BBC_micro:bit_CMSIS-DAP_" as model.
The underscores are a bit funny; it appears that udev is a bit
overprotective and serves the data in a "funny-character-safe" way by
default (whatever the precise rules are -- I consider this a bad
practice, byte strings should be treated as first class objects by
system libraries without regard for a particular language's or shell
script's or configuration tools' prefererences).
The Python bindings apparently do not give good access to the original
value -- using ID_MODEL gives "_BBC_micro:bit_CMSIS-DAP_" and using
ID_MODEL_ENC gives "\x22BBC\x20micro:bit\x20CMSIS-DAP\x22". pyudev has
nothing that deals with this encoding,
Lacking my preferred option (telling pyudev to just give us the byte
string), options are to
1. go with the current underscore format (showing people underscores in
weird places),
2. go with the ecaped format (showing people even \x20 in weird places)
3. escape it back to the original byte string
The trouble is that while we can display the binary strings (in the
table using `.decode('utf8', error='replace')`, on single-property
output using `sys.stdout.raw.write()`), the JSON output is not so
friendly (and an hour ago I was *joking* about using CBOR...).
Bottom line is that it's probably the easiest to just do 1 now (leaving
things as they are), and tip the hat to udev's dominance. The best
migration plan we would have if they ever change their string
simplification rules would be to introduce some fuzzy (possibly
regexp, possibly just glob) matching, but probably it causes them as
much trouble as us, and they'll just stick with this forever.
…--
To use raw power is to make yourself infinitely vulnerable to greater powers.
-- Bene Gesserit axiom
|
I added the decoding of the unicode escape stuff udev provides. The
I also added a bit of logic to just detect the smallest column size fitting each column, rather than shortening. That way the table will never break, but also fit small terminals if the data to show fits. |
Btw: I flashed the J-Link firmware on all of my micro:bits, so that I see a |
dist/tools/usb-serial/ttys.py
Outdated
Decodes unicode escaping in a string, e.g. "Hallo\\x20World" is decoded as | ||
"Hallo World" | ||
""" | ||
return bytes(string, "utf-8").decode("unicode_escape") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return bytes(string, "utf-8").decode("unicode_escape") | |
return bytes(string, "ascii").decode("unicode_escape").encode('latin1').decode('utf8', error='replace') |
The 'utf-8' in the original version is a red herring: the input is ASCII per the (presumed) escaping rules. Also the "unicode_escape" is misleading (but Python has no equivalent "c_escape"): It would properly unescape any \u1234' strings, but what's actually there is
\x89`, and that's part of Python's still weird admittance of latin1's special position in the encoding world that got grandfathered into Unicode by the assignment of code points 128-255.
That all result is the mojibake you see on Universit*t.
The proposed changes are still lossy (a bit like the underscores), but in a better defined way (precisely those characters that are not valid UTF-8 are replaced), and still easy to enter (you can enter regular text, or copy-paste the output around).
This loses bytes that are not UTF-8, which could in theory be salvaged through surrogate escaping, but if someone's device is named "foo\xffbar", that's not because their whole system is using a codec where \xff is valid, but just because someone screwed up in firmware development. (Not that there'd be a rule that these things are UTF-8, but anything else is just so 20th century).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gives me
$ make list-ttys
Traceback (most recent call last):
File "/home/maribu/Repos/software/RIOT/dist/tools/usb-serial/ttys.py", line 220, in <module>
print_ttys(sys.argv)
File "/home/maribu/Repos/software/RIOT/dist/tools/usb-serial/ttys.py", line 205, in print_ttys
tty = tty2dict(dev)
File "/home/maribu/Repos/software/RIOT/dist/tools/usb-serial/ttys.py", line 36, in tty2dict
result["vendor"] = unescape(dev.get("ID_VENDOR_ENC"))
File "/home/maribu/Repos/software/RIOT/dist/tools/usb-serial/ttys.py", line 19, in unescape
res = bytes(string, "ascii").decode("unicode_escape")
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 18: ordinal not in range(128)
make: *** [/home/maribu/Repos/software/RIOT/tests/periph_timer_periodic/../../Makefile.include:867: list-ttys] Error 1
Using bytes(string, "utf8")
instead of bytes(string, "ascii")
gives me:
$ make list-ttys
path | driver | vendor | model | model_db | serial | ctime
-------------|----------|--------------------------|--------|------------------------|----------|---------
/dev/ttyUSB0 | ftdi_sio | Freie Universität Berlin | MSB-A2 | FT232 Serial (UART) IC | ARR6DKEE | 20:21:22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's
- funny because it means that udev is letting high bytes go through unescaped (while escaping even whitespace with \x20)
- weird because I've just tested it and when I enter UTF-8 in a RIOT USB device's name that conversion makes it break ...
... except that I just found that USB identifiers are UTF-16 (or UCS-2?); reading up a little to ensure that we don't nail down a conversion here that turns out to be broken just because it happens to work at the Universität device.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Umlaut actually looks pretty UTF-8 like to me:
$ udevadm info /dev/ttyUSB0
P: /devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2:1.0/ttyUSB0/tty/ttyUSB0
N: ttyUSB0
S: serial/by-id/usb-Freie_Universität_Berlin_MSB-A2_ARR6DKEE-if00-port0
S: serial/by-path/pci-0000:00:14.0-usb-0:2:1.0-port0
E: DEVLINKS=/dev/serial/by-id/usb-Freie_Universität_Berlin_MSB-A2_ARR6DKEE-if00-port0 /dev/serial/by-path/pci-0000:00:14.0-usb-0:2:1.0-port0
E: DEVNAME=/dev/ttyUSB0
E: DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2:1.0/ttyUSB0/tty/ttyUSB0
E: ID_BUS=usb
E: ID_MODEL=MSB-A2
E: ID_MODEL_ENC=MSB-A2
E: ID_MODEL_FROM_DATABASE=FT232 Serial (UART) IC
E: ID_MODEL_ID=6001
E: ID_PATH=pci-0000:00:14.0-usb-0:2:1.0
E: ID_PATH_TAG=pci-0000_00_14_0-usb-0_2_1_0
E: ID_PCI_CLASS_FROM_DATABASE=Serial bus controller
E: ID_PCI_INTERFACE_FROM_DATABASE=XHCI
E: ID_PCI_SUBCLASS_FROM_DATABASE=USB controller
E: ID_REVISION=0600
E: ID_SERIAL=Freie_Universität_Berlin_MSB-A2_ARR6DKEE
E: ID_SERIAL_SHORT=ARR6DKEE
E: ID_TYPE=generic
E: ID_USB_DRIVER=ftdi_sio
E: ID_USB_INTERFACES=:ffffff:
E: ID_USB_INTERFACE_NUM=00
E: ID_VENDOR=Freie_Universität_Berlin
E: ID_VENDOR_ENC=Freie\x20Universität\x20Berlin
E: ID_VENDOR_FROM_DATABASE=Future Technology Devices International, Ltd
E: ID_VENDOR_ID=0403
E: MAJOR=188
E: MINOR=0
E: SUBSYSTEM=tty
E: USEC_INITIALIZED=7836843200
Using
def unescape(string):
"""
Decodes unicode escaping in a string, e.g. "Hallo\\x20World" is decoded as
"Hallo World"
"""
return string.replace("\\x20", " ")
Yields:
path | driver | vendor | model | model_db | serial | ctime
-------------|----------|--------------------------|--------|------------------------|----------|---------
/dev/ttyUSB0 | ftdi_sio | Freie Universität Berlin | MSB-A2 | FT232 Serial (UART) IC | ARR6DKEE | 21:17:33
If udev really only replaces spaces and invalid stuff, restoring spaces would be the only thing needed. Maybe I should find some documentation on this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've verified that the current conversion produces correct output even when non-UCS2 model names are used; furthermore, the output is consistent with the output of lsusb.
The incantation looks weird in terms of locales, but that's just because Python's decode('unicode_escape') makes a hardcoded choice of latin1 for bytes that are present in the string in unescaped form, and that gets reverted in the latin1 encode step.
I think this is fine (but will need to make up my mind about whether I want to clean up the crude UTF-16 support I hacked into _cpy_str_to_utf16 for inclusion in RIOT :-) ).
If we at any point happen to develop (or port) a JTAG probe on (to) RIOT, let's make sure both its over-the-network and USB identifiers are such that they can indicate when they're hardwired to any given board and indicate that :-) |
OK, I think I addressed all comments. I also exposed the DB entry for vendor and model, as they can be vastly different from the reported model and vendor name. I could imagine that a debugger would report the board it is hardwired to in the model string, while the DB entry would also give the type of the debugger. There might be use cases to filter for either of them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it could replace list-ttys.sh (and find-tty-.sh) in a few places, but that can also be done after the tool is established. (Easy replacements are various READMEs that offer list-ttys.sh to find serial numbers -- given it's not used anywhere else, and its most common mention is as a known bug in the release notes and one more bug from non-GNU find use, I think it could be deprecated immediately).
A few more comments, but other than that I think it's good.
if args.most_recent: | ||
most_recent = ttys[0] | ||
for tty in ttys: | ||
if tty["ctime"] > most_recent["ctime"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a curious observation: Dual devices like hifive1 the below have exactly the same timestamp (down to the nanosecond, so probably created atomically). The current mechanism takes the first of them in the sequence returned from pyudev, which I think will be fine.
path | driver | vendor | model | model_db | serial | ctime
-------------|----------|--------|---------------|-------------------------------|--------|---------
/dev/ttyUSB0 | ftdi_sio | FTDI | Dual RS232-HS | FT2232C/D/H Dual UART/FIFO IC | None | 13:19:40
/dev/ttyUSB1 | ftdi_sio | FTDI | Dual RS232-HS | FT2232C/D/H Dual UART/FIFO IC | None | 13:19:40
If at some point we encounter devices where any other TTY than the first is used, a filter for that can still be added (I wouldn't do that now when there is no use case yet).
Alternative Tool: ttys.py | ||
------------------------- | ||
|
||
As an alternative to above shell script, `$(RIOTTOOLS)/usb-serial/ttys.py` can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above is already ttys.py, I think this can be unified.
board: | ||
|
||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{Makefile} | ||
PORT ?= $(shell $(RIOTTOOLS)/usb-serial/ttys.py --most-recent --format path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it still make sense to have that snippet around now that MOST_RECENT_PORT is available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try to come up with a more sensible example.
fde8de4
to
360382a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK: this is correct for all I can tell, and an improvement to the workflows.
- Provide a new tool to list and filter TTYs - Change `Makefile.include` to use `$(RIOTTOOLS)/usb-serial/ttys.py` instead of `$(Q)$(RIOTTOOLS)/usb-serial/list-ttys.sh` to implement `make list-ttys` - Extend `makefiles/tools/serial.inc.mk` to allow using the most recent port by passing `MOST_RECENT_PORT=1` as environment variable or parameter to make Co-authored-by: chrysn <[email protected]> Co-authored-by: Koen Zandberg <[email protected]>
360382a
to
b296ade
Compare
Due to the change of the backend for |
All green :) |
Contribution description
This provides
dist/tools/usb-serial/ttys.py
to list and filter TTY interfaces. I just give some example invocations to show the features:This is wired up to
make list-ttys
.Additionally,
make MOST_RECENT_PORT=1 term
should now connect to the most recently connected board.Testing procedure
Run the script for different parameters and check if it works as advertised.
Issues/PRs references
None