ESOUI - View Single Post

Sharlikran · 07/21/20, 01:42 PM

I am not sure which encoding you should use either. According to the xEdit docs, Czech is code page 1250 and English is code page 1252. However, there is no way to know whether Zenimax handles text the same way Skyrim and Fallout 4 handle plugin localization strings. For Skyrim and Fallout 4 we normally do not recommend UTF8, because it is not code page 1252; it is a separate encoding.
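To make the difference concrete, here is a small Python sketch (my own illustration, not anything from Zenimax or xEdit) that encodes the same Czech text under each of the three encodings. The helper name `try_encode` and the sample sentence are just for illustration.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding lacks a character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


sample = "Příliš žluťoučký kůň"  # illustrative Czech sample text

for encoding in ("cp1250", "cp1252", "utf-8"):
    data = try_encode(sample, encoding)
    if data is None:
        print(f"{encoding}: cannot represent this text")
    else:
        print(f"{encoding}: {len(data)} bytes")

# The same byte means different letters under different code pages.
raw = "č".encode("cp1250")      # a single byte, 0xE8
print(raw.decode("cp1252"))     # read back as cp1252 it becomes "è"
```

Code page 1252 has no slot for letters such as "ř" or "č", which is why Czech text saved as 1250 shows the wrong characters when a program reads it as 1252.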

Using UTF8 might be the best for ESO although I am not sure.

If you have Windows 10, be careful that you do not have the Beta UTF8 option checked in the language settings in the control panel. That will most likely set the system code page to 65001 (UTF8). Using 65001 as the system code page is new in Windows 10 and may not be compatible with some programs.
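If you want to check which code page Windows is actually using, something like the following Python sketch should work. It assumes the Win32 `GetACP` call reached through `ctypes`; the function names `active_code_page` and `describe` are my own, and on anything other than Windows the sketch just reports that it cannot tell.

```python
import ctypes
import sys

UTF8_CODE_PAGE = 65001


def active_code_page():
    """Return the Windows ANSI code page, or None on other systems."""
    if sys.platform != "win32":
        return None
    # GetACP is a Win32 API that returns the current ANSI code page number.
    return ctypes.windll.kernel32.GetACP()


def describe(code_page):
    """Turn a code page number into a short human-readable message."""
    if code_page is None:
        return "not running on Windows"
    if code_page == UTF8_CODE_PAGE:
        return "system code page is 65001 (UTF-8); the beta option is probably on"
    return f"system code page is {code_page}"


print(describe(active_code_page()))
```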

People need to remember that, for these purposes, a language goes hand in hand with a code page. Asking about the encoding is like asking what language you speak. When we ask a person what language they speak, the answer is English, German, or Czech. When we ask a computer what encoding it is using, the answer is UTF8, code page 1252, or code page 1250.

I recommend figuring out what English uses. The reasoning is the same as for Skyrim and Fallout 4 plugin localization strings: since users have Skyrim and Fallout 4 set to English, the game expects all strings to use the encoding for English, which is code page 1252.

It only makes sense to use the same thing here. Whatever Zenimax uses for English is what your unofficial translation should use, because users will have the game set to English while your text is in Czech. So I would say UTF8 or 1252. I just can't tell you which.
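One way to narrow it down is to look at the raw bytes of an existing English file. The Python sketch below is only a rough heuristic under my own assumptions, and `guess_encoding` is a made-up helper, not part of any tool. Keep in mind that plain ASCII text is byte-for-byte identical in UTF8 and 1252, so the file has to contain at least one non-ASCII character (an accented name, a curly quote) for the test to tell you anything.

```python
def guess_encoding(data: bytes) -> str:
    """Rough heuristic: classify raw bytes as BOM-marked, ASCII, UTF-8, or other."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8 with BOM"
    if all(byte < 0x80 for byte in data):
        return "ascii only (same bytes in UTF-8 and cp1252)"
    try:
        data.decode("utf-8")  # strict decoding fails on invalid UTF-8 sequences
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8 (possibly a single-byte code page such as cp1252)"


# Sample bytes standing in for the contents of a real file.
print(guess_encoding("café".encode("utf-8")))
print(guess_encoding("café".encode("cp1252")))
print(guess_encoding(b"plain English text"))
```

For a real file, read it in binary mode (`open(path, "rb")`) and pass the bytes to the function. A result of "not utf-8" does not prove the file is 1252, only that it is some other encoding.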

If you try UTF8, make sure you use a program that can save UTF8 without BOM. You do not want the BOM because you do not need the extra bytes to signal the type of encoding.

The UTF-8 BOM is a sequence of bytes at the start of a text stream (0xEF, 0xBB, 0xBF) that allows the reader to more reliably guess a file as being encoded in UTF-8. Normally, the BOM is used to signal the endianness of an encoding, but since endianness is irrelevant to UTF-8, the BOM is unnecessary.
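If your editor cannot save without a BOM, you can strip it afterwards. Here is a minimal Python sketch, with `strip_utf8_bom` being my own helper name and the sample word only an illustration. In Python the "utf-8" codec writes no BOM, while "utf-8-sig" adds one.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (0xEF 0xBB 0xBF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


text = "Vítejte"                       # illustrative Czech sample
with_bom = text.encode("utf-8-sig")    # BOM followed by the UTF-8 text
without_bom = text.encode("utf-8")     # UTF-8 text only, no BOM

print(with_bom[:3].hex())              # ef bb bf, the three BOM bytes
print(strip_utf8_bom(with_bom) == without_bom)
```

When writing a file from Python, `open(path, "w", encoding="utf-8")` produces the BOM-free form.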