05/31/14, 05:51 PM   #1
Sharlikran
 
Join Date: Apr 2014
Posts: 655
Unicode in a map name

Lua Code:
["bleakrock/bleakrockvillage_base"] = {"Bleakrock Village", "Ödfels^N,in", "\xd6dfels^N,auf", "\xEF\xBF\xBDdfels^N,auf", "village de Morneroc^md",},
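I have some map names with escape sequences in them. One of them, \xEF\xBF\xBD, looks like a BOM sequence but is actually the UTF-8 encoding of U+FFFD, the replacement character (a UTF-8 BOM would be \xEF\xBB\xBF).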
Lua Code:
function Harvest.GetNewMapName(mapName)
    local result = nil
    for newMapName, translations in pairs(Harvest.mapSystem) do
        if Harvest.contains(translations, mapName) then
            if result then
                return nil -- more than one map matches; skip to avoid wrong data
            else
                result = newMapName
            end
        end
    end
    return result
end
That function looks up the possible matches. It finds "Ödfels^N,in" but not "\xd6dfels^N,auf" or "\xEF\xBF\xBDdfels^N,auf". I think that is because the string is compared as a literal sequence of bytes rather than as a Unicode string.
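One way to see what is really being compared is to dump the bytes of each candidate. The sketch below is only an illustration: `toHex` is a hypothetical helper, not part of the addon, and the strings use decimal escapes because those work in every Lua version (the "\xNN" hex form is only understood by Lua 5.2 and later, and I have not verified how the game's interpreter treats it). It suggests that "Ö" typed into a UTF-8 file and the single byte 0xD6 are different byte strings, so a plain comparison cannot treat them as equal.

```lua
-- Hypothetical helper (not part of the addon): render a string as hex bytes.
local function toHex(s)
    local bytes = {}
    for i = 1, #s do
        bytes[#bytes + 1] = string.format("%02X", s:byte(i))
    end
    return table.concat(bytes, " ")
end

-- Three ways the first character of the map name could be stored.
local latin1   = "\214dfels"          -- 0xD6: "Ö" as a single Latin-1 byte
local utf8Form = "\195\150dfels"      -- 0xC3 0x96: "Ö" encoded as UTF-8
local replaced = "\239\191\189dfels"  -- 0xEF 0xBF 0xBD: U+FFFD in UTF-8

print(toHex(latin1))    --> D6 64 66 65 6C 73
print(toHex(utf8Form))  --> C3 96 64 66 65 6C 73
print(toHex(replaced))  --> EF BF BD 64 66 65 6C 73
```

Running the same helper on the mapName that is passed to Harvest.GetNewMapName should show which of the three forms, if any, the lookup table has to contain.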

Lua Code:
-- Decode a UTF-8 string into a table of code points, terminated by 0.
-- Relies on the bit32 library introduced in Lua 5.2.
function Utf8to32(utf8str)
    assert(type(utf8str) == "string")
    local res, seq, val = {}, 0, nil
    for i = 1, #utf8str do
        local c = string.byte(utf8str, i)
        if seq == 0 then
            -- Start of a new sequence: store the code point finished so far.
            table.insert(res, val)
            -- Sequence length is derived from the lead byte.
            seq = c < 0x80 and 1 or c < 0xE0 and 2 or c < 0xF0 and 3 or
                  c < 0xF8 and 4 or --c < 0xFC and 5 or c < 0xFE and 6 or
                  error("invalid UTF-8 character sequence")
            -- Keep only the payload bits of the lead byte.
            val = bit32.band(c, 2^(8-seq) - 1)
        else
            -- Continuation byte: shift in its low six bits.
            val = bit32.bor(bit32.lshift(val, 6), bit32.band(c, 0x3F))
        end
        seq = seq - 1
    end
    table.insert(res, val)
    table.insert(res, 0)
    return res
end
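As a cross-check on the byte sequences in question, here is a minimal sketch that decodes only the first UTF-8 sequence of a string using plain arithmetic, so it does not depend on bit32. `firstCodePoint` is a hypothetical name of my own, and like the function above it does no validation. It indicates that EF BF BD decodes to U+FFFD (the replacement character), whereas a byte order mark, U+FEFF, would be EF BB BF.

```lua
-- Hypothetical helper: decode the first UTF-8 sequence of s into a code point.
-- Assumes s is non-empty and well formed; nothing is validated.
local function firstCodePoint(s)
    local c = s:byte(1)
    if c < 0x80 then
        return c                      -- single-byte (ASCII) character
    end
    local len, val
    if c < 0xE0 then
        len, val = 2, c % 0x20        -- 110xxxxx: five payload bits
    elseif c < 0xF0 then
        len, val = 3, c % 0x10        -- 1110xxxx: four payload bits
    else
        len, val = 4, c % 0x08        -- 11110xxx: three payload bits
    end
    for i = 2, len do
        -- Each continuation byte contributes its low six bits.
        val = val * 0x40 + s:byte(i) % 0x40
    end
    return val
end

print(string.format("U+%04X", firstCodePoint("\239\191\189dfels")))  --> U+FFFD
print(string.format("U+%04X", firstCodePoint("\239\187\191dfels")))  --> U+FEFF
print(string.format("U+%04X", firstCodePoint("\195\150dfels")))      --> U+00D6
```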
I googled that, but I'm not really trying to check for valid UTF-8 characters or make sure there aren't any errors. What I'm wondering is which simple command I can use to take a string like "\xEF\xBF\xBDdfels^N,auf" and match it with "�dfels^N,auf", where "�" is the \xEF\xBF\xBD sequence (U+FFFD, the replacement character, rather than a true BOM).

When "\xEF\xBF\xBDdfels^N,auf" == "�dfels^N,auf", the result should be true.
Lua Code:
string.escape(string1, string2)
Is there something like above where string1 = "\xEF\xBF\xBDdfels^N,auf" and string2 = "�dfels^N,auf" so then it's true?
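For what it is worth, standard Lua has no string.escape, and as far as I can tell none is needed when the escape is written inside a string literal: the parser turns it into raw bytes, and == compares bytes. A conversion step only seems necessary when the backslashes are real characters in the data, for example text read from a saved file. The sketch below illustrates both cases under that assumption; `unescapeHex` is a hypothetical helper, and the literal uses decimal escapes so that it behaves the same in Lua 5.1 and later.

```lua
-- Hypothetical helper: turn literal "\xNN" text (backslash, x, two hex digits)
-- into the single byte it names. Other text is left untouched.
local function unescapeHex(s)
    return (s:gsub("\\x(%x%x)", function(hex)
        return string.char(tonumber(hex, 16))
    end))
end

-- Escape resolved by the parser: the string already holds the three raw bytes.
local fromLiteral = "\239\191\189dfels^N,auf"
-- Backslashes are part of the data: the string holds the characters \, x, E, F, ...
local fromData = "\\xEF\\xBF\\xBDdfels^N,auf"
-- The target: the three bytes of U+FFFD followed by the rest of the name.
local expected = string.char(0xEF, 0xBF, 0xBD) .. "dfels^N,auf"

print(fromLiteral == expected)            --> true
print(fromData == expected)               --> false
print(unescapeHex(fromData) == expected)  --> true
```

If the first comparison is the situation in the addon and the lookup still fails, the name coming from the game presumably contains different bytes from the table entry, which a byte dump of both strings would confirm.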

Last edited by Sharlikran : 05/31/14 at 05:55 PM.