author | Hiltjo Posthuma <hiltjo@codemadness.org> | 2020-06-17 21:35:39 +0200 |
---|---|---|
committer | Hiltjo Posthuma <hiltjo@codemadness.org> | 2020-06-17 21:35:39 +0200 |
commit | 818ec746f4caae453d09368b101c3e841cf39870 (patch) | |
tree | 288b6f27e711cf109d06a0dc00b91b440e014837 | |
parent | 9ba7ecf7b15ec2986c6142036706aa353b249ef9 (diff) |
fix unicode glitch in DCS strings, patch by Tim Allen
Reported on the mailing list:
"
I discovered recently that if an application running inside st tries to
send a DCS string, subsequent Unicode characters get messed up. For
example, consider the following test-case:
printf '\303\277\033P\033\\\303\277'
...where:
- \303\277 is the UTF-8 encoding of U+00FF LATIN SMALL LETTER Y WITH
DIAERESIS (ÿ).
- \033P is ESC P, the token that begins a DCS string.
- \033\\ is ESC \, a token that ends a DCS string.
- \303\277 is the same ÿ character again.
If I run the above command in a VTE-based terminal, or xterm, or
QTerminal, or pterm (PuTTY), I get the output:
ÿÿ
...which is to say, the empty DCS string is ignored. However, if I run
that command inside st (as of commit 9ba7ecf), I get:
ÿÃ¿
...where those last two characters are \303\277 interpreted as ISO8859-1
characters, instead of UTF-8.
I spent some time tracing through the state machines in st.c, and so far
as I can tell, this is how it works currently:
- ESC P sets the "ESC_DCS" and "ESC_STR" flags, indicating that
incoming bytes should be collected into the strescseq buffer, rather
than being interpreted.
- ESC \ sets the "ESC_STR_END" flag (when ESC is received), and then
calls strhandle() (when \ is received) to interpret the collected
bytes.
- If the collected bytes begin with 'P' (i.e. if this was a DCS
string) strhandle() sets the "ESC_DCS" flag again, confusing the
state machine.
If my understanding is correct, fixing the problem should be as easy as
removing the line that sets ESC_DCS from strhandle():
diff --git a/st.c b/st.c
index ef8abd5..b5b805a 100644
--- a/st.c
+++ b/st.c
@@ -1897,7 +1897,6 @@ strhandle(void)
 		xsettitle(strescseq.args[0]);
 		return;
 	case 'P': /* DCS -- Device Control String */
-		term.mode |= ESC_DCS;
 	case '_': /* APC -- Application Program Command */
 	case '^': /* PM -- Privacy Message */
 		return;
I've tried the above patch and it fixes my problem, but I don't know if
it introduces any others.
"
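The state-machine walkthrough in the report can be illustrated with a toy
model. The sketch below is not st's code: every name in it (toy_term,
toy_putc, toy_strhandle, toy_run, the TOY_ESC_* values) is hypothetical, it
keeps all flags in a single field, and it assumes that a stale DCS flag makes
later high bytes bypass UTF-8 decoding, which is one plausible reading of the
reported symptom rather than a confirmed description of st's internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this toy model; not st's definitions. */
enum {
	TOY_ESC_START   = 1 << 0, /* ESC seen, waiting for the next byte */
	TOY_ESC_STR     = 1 << 1, /* collecting a string sequence */
	TOY_ESC_STR_END = 1 << 2, /* ESC seen inside a string */
	TOY_ESC_DCS     = 1 << 3  /* the string is a DCS string */
};

struct toy_result {
	int decoded; /* two-byte UTF-8 sequences decoded as one character */
	int raw;     /* high bytes passed through one byte per character */
};

struct toy_term {
	int flags;
	int pre_patch; /* 1: re-set the DCS flag when the string ends */
	int lead;      /* 1 while a UTF-8 lead byte awaits its continuation */
	struct toy_result res;
};

static void
toy_strhandle(struct toy_term *t)
{
	t->flags &= ~(TOY_ESC_STR | TOY_ESC_STR_END | TOY_ESC_DCS);
	if (t->pre_patch)
		t->flags |= TOY_ESC_DCS; /* models the line the patch removes */
}

static void
toy_putc(struct toy_term *t, unsigned char c)
{
	if (t->flags & TOY_ESC_STR) {
		if ((t->flags & TOY_ESC_STR_END) && c == '\\') {
			toy_strhandle(t); /* ESC \ terminates the string */
			return;
		}
		t->flags &= ~TOY_ESC_STR_END;
		if (c == 0x1b)
			t->flags |= TOY_ESC_STR_END;
		return; /* string payload is collected, not displayed */
	}
	if (t->flags & TOY_ESC_START) {
		t->flags &= ~TOY_ESC_START;
		if (c == 'P') /* ESC P begins a DCS string */
			t->flags |= TOY_ESC_STR | TOY_ESC_DCS;
		return;
	}
	if (c == 0x1b) {
		t->flags |= TOY_ESC_START;
		return;
	}
	if (t->flags & TOY_ESC_DCS) {
		/* ASSUMPTION: a stale DCS flag bypasses UTF-8 decoding. */
		if (c >= 0x80)
			t->res.raw++;
		return;
	}
	if (t->lead && (c & 0xC0) == 0x80) {
		t->lead = 0;
		t->res.decoded++; /* lead + continuation form one character */
	} else {
		t->lead = (c & 0xE0) == 0xC0; /* two-byte lead byte? */
	}
}

static struct toy_result
toy_run(const char *bytes, size_t len, int pre_patch)
{
	struct toy_term t = { 0 };
	size_t i;

	assert(bytes != NULL);
	t.pre_patch = pre_patch;
	for (i = 0; i < len; i++)
		toy_putc(&t, (unsigned char)bytes[i]);
	return t.res;
}
```

Under these assumptions, feeding the eight bytes of the printf test case to
toy_run() with pre_patch set to 1 yields one decoded character and two raw
bytes, matching the reported glitch. With pre_patch set to 0 both ÿ
characters are decoded.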