Testing with a Model 100 - various glitches #3

Open
dawidi opened this issue Apr 30, 2020 · 13 comments

Comments

@dawidi

dawidi commented Apr 30, 2020

I've used your terminfo file when connecting a recently acquired Model 100 to a current (Arch) Linux box via a Null-modem cable and an RS232-USB adapter cable. (Just out of curiosity about vintage tech, not that I'd need it for anything.)

It works in principle, and certainly much better than with the default terminal type, but there are still glitches, at least some of which are presumably due to the terminfo still not being an exact match for the M100's terminal client:

  • showkey -a indicates the correct values for cursor key presses, but e.g. in bash (or readline in general) only cursor-up works, all other directions produce a beep. This seems like it would be the easiest to fix, but I have no idea why it doesn't work already.
  • Reverse text works, but intense/standout/bright text generated by various commands (like man, less or pacman) results in escape sequences being printed verbatim, i.e. "0;1m" or "0m" or "1;34m::0;1m" sprinkled in between the expected text output. Is that a program ignoring the terminfo, or a missing setting in the terminfo to clarify that those sequences are not to be used?

Am I doing anything wrong? If not, are you interested in fixing those two? Can I do anything to help?

And then, related but probably not within scope of this project:

  • Some fullscreen programs like Midnight Commander spray the terminal with a ton of control characters, lock up and leave the session in an unusable state. Apparently mc uses a framework other than ncurses; maybe that one has its own hardcoded terminal types, in which case that's probably not fixable with a terminfo.
  • What about conversion between the M100's internal character set and a UTF-8 capable host? I wrote a conversion table (as described in the libiconv docs, line 335+, especially for modified Latin characters) but I have no idea how to continue from there, or if that is even the right approach.
@hackerb9
Owner

hackerb9 commented May 4, 2020

Thank you for the detailed and helpful bug report! I'm glad you've found this useful.

I have added information in the README file that tells people how to get the arrow keys working. I also added a section about problems with programs that send incorrect escape sequences (usually VT220 color setting). I do not see a specific terminfo setting to say, "Don't show colors"; each application seems to have its own way of doing it. :-(

I do not know about the iconv suggestion, but please do share your conversion table. Perhaps one could write a C program that sits between you and the tty (using a pty) and does the iconv conversion. On the other hand, it might be easier to modify TELCOM to recognize a few UTF-8 sequences.

@dawidi
Author

dawidi commented May 5, 2020

Progress :-)

Your readme section on .inputrc has forward-char and backward-char the wrong way round; otherwise, adding this file has indeed sorted out the navigation issues in bash.
Also you could add "\x7F": delete-char to make SHIFT-BKSP on the Model 100 do forward deletion as intended.

Charset thoughts:
For now, here's my attempt at a conversion list. It should be a reversible transformation, i.e. converting a Tandy file to UTF-8 and back to the Tandy charset should yield exactly the original bytes again. But not all of the Tandy's custom characters have a perfect match in Unicode, and for some of them I have no idea what they even depict. (And can you believe Unicode, with all its "skin-color-aware confused grin" Emojis, doesn't even have a single "stick figure" codepoint? The Model 100 has two!)
I'll experiment with making a small conversion program to sort out the charset next (certainly easier than understanding how to modify and recompile libiconv). Not sure if one can wire up the login session through that, but at the very least it'll be able to convert text files.

@hackerb9
Owner

hackerb9 commented May 7, 2020

Good idea with the Shift-Bksp mapping. I've added it as Control-? since I find that a little more readable than '\x7f'.

I've changed Backward and Forward in the .inputrc documentation the other way around. Just to double check, you are saying that on the Model 100, the right arrow key sends Control-] and the left arrow sends Control-?. If so, that means I've got it switched in a lot of other places, too.

@hackerb9
Owner

hackerb9 commented May 7, 2020

Your conversion table looks interesting. Any chance you can format it to look more like this?

model100.charset.pdf

@dawidi
Author

dawidi commented May 7, 2020

> Good idea with the Shift-Bksp mapping. I've added it as Control-? since I find that a little more readable than '\x7f'.

I had tried that syntax first but it didn't work initially (possibly due to a mistake somewhere else), that's why I went with the raw hex value. If Control-? works, that's even better.

> I've changed Backward and Forward in the .inputrc documentation the other way around. Just to double check, you are saying that on the Model 100, the right arrow key sends Control-] and the left arrow sends Control-?. If so, that means I've got it switched in a lot of other places, too.

No, key_left=^], key_right=^\ is correct, but in "Control-]: forward-char" and "Control-\: backward-char", "forward" and "backward" are reversed. "forward" should be key-right, not key-left.
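
The corrected bindings discussed in this exchange could be sketched as a small shell helper that prints the readline lines. This is only an illustration, not the README's actual text: the function name `m100_inputrc` is made up, and the keys are written as raw hex escapes (^\ is 0x1C, ^] is 0x1D, DEL is 0x7F) to sidestep any doubt about how the Control-x spelling is parsed.

```shell
# Hypothetical helper: print readline bindings for the Model 100 keys
# mentioned in this thread (right = ^\, left = ^], SHIFT-BKSP = DEL).
m100_inputrc() {
  # The quoted 'EOF' keeps the backslashes literal for readline to interpret.
  cat <<'EOF'
"\x1c": forward-char
"\x1d": backward-char
"\x7f": delete-char
EOF
}
```

One possible way to use it is `m100_inputrc >> ~/.inputrc`, followed by starting a new bash session so that readline rereads the file.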

@dawidi
Author

dawidi commented May 7, 2020

Another update: I wrote a small, quick and dirty "random 8-bit charset to UTF-8" converter tonight (my first C program in over a decade, so proud). It has already allowed me to find several code points I got wrong in my conversion list. I'll post it once those are sorted out. I might also swap CR and LF in the conversion list, as the Model 100 only writes CR for line endings.

Currently typing up a test document on the Model 100 using (and describing) all the extended characters...

model100.charset.pdf

I do have the original manuals, but even there some of the characters are not clear enough to tell what they're supposed to be, and apparently there are several mistakes (or character ROM differences between the US and Europe?) in the documentation, corrected in pencil by the previous owner. Might do something mimicking the design of the PDF reference table later, yes.

@hackerb9
Owner

hackerb9 commented May 8, 2020

I love that you're typing up the document on the Model 100. Are you using the builtin editor or something like vi? I actually created much of README.md on my Tandy 200.

For the implementation of your character set mapping, it may be worthwhile to look into locale "charmaps". For example, try locale -m to see the character maps your computer currently supports.

@dawidi
Author

dawidi commented May 8, 2020

I'm using the builtin TEXT, I find it quite pleasant, compared to other editors from the same era.
Never got the hang of vi (I understand how the interface works, but... argh, nope). On the Linux console or through ssh I normally use mcedit (from Midnight Commander) but that doesn't work through TERM, or at least not yet.

> the character maps your computer currently supports

I tried this in bash:
( while IFS='\n, ' read -r enc; do echo -ne '\xB1\xB2\xB3 \xF4\xF1\xF9' | iconv -c -f $enc -t UTF-8; echo " from $enc"; done <<< "$( iconv -l )" ) | sort | less
This tries to interpret the hex sequence B1 B2 B3 F4 F1 F9 in every available character set and turn it into UTF-8. There are over a thousand encodings in the list, but not one of them converts to anything even remotely resembling the ÄÖÜ ├─┤ that the Model 100 would display.

The Tandy Model 100/102/200/Olivetti M10 character set isn't in there, and probably none of the inconsistent model-specific sets from the 8-bit era are. I can't even find this set properly documented anywhere on the web, except the low-quality scan PDF of the M100 quick reference you linked to above. The only character set I found anything about was the one used by the TRS-80 Model 1, which merely uses the 128-255 region for block graphics.
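
For anyone repeating the probe above, a slightly more defensive version of the loop might look like the sketch below. The function names `probe_one` and `probe_all` are invented for illustration. It writes the six test bytes as octal escapes (portable across printf implementations), quotes the encoding name, and skips names that iconv rejects.

```shell
# Hypothetical helper: decode the test bytes B1 B2 B3 20 F4 F1 F9 from
# the encoding named in $1 and print the result as UTF-8.
probe_one() {
  printf '\261\262\263 \364\361\371' | iconv -c -f "$1" -t UTF-8 2>/dev/null
}

# Hypothetical helper: try every name that `iconv -l` lists.
probe_all() {
  # Split the listing on commas, spaces and slashes so that each
  # encoding name (or alias) ends up on its own line.
  iconv -l | tr -s ', /' '\n' | while read -r enc; do
    [ -n "$enc" ] || continue
    out=$(probe_one "$enc") || continue   # skip encodings iconv cannot use
    printf '%s  from %s\n' "$out" "$enc"
  done | sort
}
```

Running `probe_all | less` then gives the same kind of listing as the one-liner, while `probe_one ISO-8859-1` checks a single candidate.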

@hackerb9
Owner

hackerb9 commented May 8, 2020

Good work. How hard does it look to create a new charmap?

I took some photos a while ago of all the characters with their code points. I'll upload those soon since there's no other easily discoverable reference.

If I recall correctly, midnight commander, due to its DOS heritage, never used termcap or terminfo.

@hackerb9
Owner

hackerb9 commented May 8, 2020

Btw, for editing from your Model 100, maybe try nano -p foo.txt. I don't think it's ideal since it shows a menu at the bottom using up two rows of text, but it worked for me.

[Update: I just tried Nano again. It uses up two lines at the top of the screen and three at the bottom. On my Tandy 200, that's fine. But on a Model 100, even if you used the -O option to get one more line, you'd only have four lines of text, which seems a bit cramped! ]

@hackerb9
Owner

hackerb9 commented May 8, 2020

I've uploaded photos of the character set for the Tandy 200 here: https://github.com/hackerb9/Tandy-Terminfo/tree/master/README.md.d . Let me know if it looks any different from the Model 100. I also made a single compressed file with all the images in it: https://github.com/hackerb9/Tandy-Terminfo/blob/master/README.md.d/td200-charset.gif

@dawidi
Author

dawidi commented May 13, 2020

Whoa, I see you've been busy polishing things in the meantime :-D

Charset comparison
I can confirm that the extended charset in your photos looks exactly the same as my Model 100 (European "A" series ROM).

Nano editor
nano -Upx only uses the first line as the titlebar, and is otherwise quite usable indeed as a remote editor. It does choke on extended (>127) characters though, converting them to sequences like ^h^|^} and messing up the display geometry. But it'll work for editing plain ASCII shell scripts and the like.

Charset conversion:
It turns out that the UTF-8 "codepoints I got wrong in my conversion list" were actually correct, but affected by a couple of incorrect bitmasks in my transcoder. It's working fine now.

Here's the conversion program: tandytranscode.c
Compile simply with gcc -o tandytc tandytranscode.c; there are no dependencies other than stdio and string.
Converting to UTF-8 after downloading from the Tandy:
tandytc -d < CHARS.DO > CHARS.utf8.txt
Converting from UTF-8 before uploading back to the Tandy:
tandytc -u < CHARS.utf8.txt > CHARS2.DO

Here's my test file describing the entire visible character set (32-255):
CHARS.DO
Note that this file has LF endings, the original on the Tandy has CR instead.
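
Since the conversion is meant to be reversible, one way to check that claim for a given file is a round trip through both directions. The sketch below assumes only the `-d` and `-u` options shown above. The function name `roundtrip_ok` and the `TANDYTC` override variable are hypothetical.

```shell
# Hypothetical check: decode FILE to UTF-8, encode it back, and compare
# the result byte for byte with the original.
# TANDYTC may name the transcoder to use; it defaults to "tandytc".
roundtrip_ok() {
  tc=${TANDYTC:-tandytc}
  "$tc" -d < "$1" | "$tc" -u | cmp -s - "$1"
}
```

For example, `roundtrip_ok CHARS.DO && echo reversible` prints the message only if the bytes survive both conversions unchanged.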

File Transfer scripts
The above CHARS.DO is so "large" that I ran into another problem. I hope you don't mind if I document it here, it's slightly out of the project's scope - but if you like it, feel free to integrate it:
I ran cat >CHARS.DO and then let TELCOM send the file, but on the Linux side only the first 4095 bytes were stored even though over 6000 bytes were sent. The reason is that the Tandy uses only CR for line endings, so the TTY never sees an LF-terminated line and the whole file counts as one long line, which overflows the kernel's N_TTY_BUF_SIZE. The solution is to set the TTY to translate CR to LF (neat, now I don't have to do that in C). While we're at it, we can also disable echo while TELCOM is sending data, so the Model 100 doesn't have to scroll. This speeds up uploading files from TELCOM to Linux significantly. I wrapped this into a little stor shell script.
When retrieving files stored on the Linux box back to the Tandy, LF should be converted back to CR. There should also be no extraneous writes before or after the file content, so I added silent "press any key" steps there. Ideally, scrolling should be disabled as well to speed things up; for this, I used the "useless" escape sequences to disable/enable scrolling. This causes the file contents to be flashed only on the last line of the display, which again is much faster. I wrapped this into a handy retr shell script.
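
The two scripts themselves are linked rather than quoted here, so the following is only a rough sketch of the approach described, not the actual stor and retr. It assumes a POSIX shell and stty, waits for ENTER rather than any key, and leaves out the scroll disable/enable escape sequences because the thread does not spell them out. The helper name `lf_to_cr` is made up.

```shell
# Filter: turn Unix LF line endings into the CR-only endings the Tandy writes.
lf_to_cr() { tr '\n' '\r'; }

# stor FILE: capture a file that TELCOM uploads over the serial login.
stor() {
  saved=$(stty -g) || return 1   # remember the current tty settings
  stty icrnl -echo               # map CR to LF on input; echo nothing back
  cat > "$1"                     # assumption: ended by typing Ctrl-D afterwards
  stty "$saved"                  # restore the saved settings
}

# retr FILE: send FILE back to the Tandy with CR line endings.
retr() {
  saved=$(stty -g) || return 1
  stty -echo
  read -r dummy                  # silent wait until the Tandy side is ready
  lf_to_cr < "$1"
  read -r dummy                  # silent wait so nothing else lands in the file
  stty "$saved"
}
```

Keeping the line-ending conversion in its own filter means it can also be used on its own, for example `lf_to_cr < notes.txt > NOTES.DO`.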

@hackerb9
Owner

Thanks for verifying the character set, that helps quite a bit!

I've incorporated your suggestions for nano -Opx and stty icrnl into the td200 shell script. I hadn't known about the 4K kernel buffer limit, so I'm glad you figured that out. I believe icrnl can simply be left enabled; there is no need to ever disable it in your transfer scripts. Btw, I'm not sure that you have to convert LF back to CR when downloading, as it seemed to work fine for me.

Good job figuring out what the "useless" escape sequence is good for. I'll document it in the README.md.

Your Unicode conversion program looks quite handy. I'm hoping that, now that there's a mapping, we can do even better by figuring out how to do 8-bit input and output transparently just by setting the charmap for the locale (export LANG=en_US.m100). As you probably noticed, I already figured out how to send some of the characters, but that uses a pre-Unicode feature of the VT52.
