[open] Character handling flaw
I haven't figured out what is special about it yet, but the "Latin small letter A with grave" (this guy: à) works fine in all text fields:
Lua Code:
However, the second half of it gets treated as a space character which results in the code below splitting the string: Lua Code:
The Lua website's demo, however, does not have this problem: Lua Code:
I'm not sure how customized your Lua implementation is, but do you fix issues like these? |
This is not a bug, but simply an encoding issue. The Lua string functions work on single bytes and assume your input is ASCII, but your .lua file is saved as UTF-8.
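A quick way to see what the string functions actually receive is to dump the bytes. This is only a sketch: it writes the character with decimal escapes so the result does not depend on how the file is saved, and the variable names are made up for illustration.

```lua
-- "à" encoded as UTF-8 is two bytes: 0xC3 0xA0 (decimal 195 and 160).
local a_grave = "\195\160"

-- The length operator counts bytes, not characters.
local first, second = string.byte(a_grave, 1, 2)
print(#a_grave, first, second)

-- The same two bytes in hexadecimal.
local hex = string.format("%02x %02x", first, second)
print(hex)
```

The length comes out as 2 because Lua counts bytes, which is why pattern classes see two separate "characters" here.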
This means the à character in your test corresponds to the two-byte sequence "c3 a0" instead of the single byte "e0". According to https://www.ascii-code.com/, "c3" is "Latin capital letter A with tilde" and "a0" is "Non-breaking space". The game font cannot properly render the first one, since it expects UTF-8 rather than extended ASCII, so it shows a box instead, and the space is picked up by gmatch. Try converting your .lua file to extended ASCII (Latin-1) and it should work as expected, although that will break any "real" UTF-8 strings you use, and the letter will be rendered as a box unless you use a custom font. |
Welcome to the hell of localization. :)
à is at least part of the extended ASCII set. But think about Russian or Japanese players :p http://lua-users.org/wiki/LuaUnicode |
Quote:
TL;DR: When you have all strings in the game in UTF-8, your string handling functions should not operate in LATIN-1. Quote:
Enter ESOLua, modified interpreter. Despite the fact that strings in the ESO API are, for obvious reasons, UTF-8 encoded, string matching functions treat strings as LATIN-1 encoded. Therefore, string.find("\195\160", "%s") returns 2, matching the trailing byte of this two-byte character (in LATIN-1, 160 is a space character). This is BOLLOCKS.
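For comparison, here is a minimal sketch of what an unmodified interpreter does with the same call. It assumes stock Lua running in the default "C" locale, where byte 160 is not classified as whitespace; the variable names are only for illustration.

```lua
-- "\195\160" is the UTF-8 encoding of "à".
local a_grave = "\195\160"

-- In stock Lua under the default "C" locale, %s matches only ASCII
-- whitespace, so neither byte of "à" matches and string.find returns nil.
local pos = string.find(a_grave, "%s")
print(pos)

-- A real ASCII space is still found; here it sits at byte offset 3.
local space_pos = string.find(a_grave .. " x", "%s")
print(space_pos)
```

If the first call returns 2 instead, as reported above for ESO, the interpreter is classifying byte 160 as a space.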
|
I admit I may not have been completely correct about everything I wrote, but the point still stands that it is not a bug, but just wrong assumptions being made.
Since the pattern classes do not support unicode, one would need to use the appropriate replacements in order to get the expected output: Lua Code:
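One possible replacement, as a minimal sketch: assuming the goal is to split on ASCII whitespace only, swap %s and %S for an explicit character set so that byte 0xA0 is never treated as a space. The helper name splitWords and the sample string are hypothetical.

```lua
-- Explicit ASCII whitespace set, used instead of the %s class.
local ASCII_SPACE = " \t\n\r\f\v"

-- Hypothetical helper: split text into words on ASCII whitespace only.
local function splitWords(text)
    local words = {}
    -- [^...]+ matches runs of bytes that are not ASCII whitespace, so the
    -- trailing byte of "à" (0xA0) stays inside its word.
    for word in string.gmatch(text, "[^" .. ASCII_SPACE .. "]+") do
        words[#words + 1] = word
    end
    return words
end

-- "voil\195\160" is "voilà" written with explicit UTF-8 bytes.
local words = splitWords("voil\195\160 tout  le monde")
print(#words, words[1])
```

Because the set lists the whitespace bytes literally, the result does not depend on how the interpreter classifies bytes above 127.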
Quote:
|
You changed my mind. They only added UTF-8 support after Chip became our new overlord, so it's likely a remnant from before that time. Would certainly be nice if they could make it so everything works consistently.
|
Tell me if this is correct. You are requesting that we replace the Lua string.find with our own UTF-8 compatible pattern matching?
|
Quote:
Code:
static int match_class (int c, int cl) { |