03/12/15, 10:43 PM   #1
Weolo
AddOn Author - Click to view addons
Join Date: Apr 2014
Posts: 79
Replace with GSUB

GetString(SI_SMITHING_RESEARCH_IN_PROGRESS)
returns "Researching..." in the en language.
I want a small variation of this, but rather than translating this text into other languages myself, I would like to replace the 3 dots with a colon and a space. I can't get the regular expression right, though; could someone help out?

Lua Code:
  local test1 = string.gsub(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS),'%.%.%.',': ')
  local test2 = GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):gsub('%.%.%.',': ')
03/12/15, 11:42 PM   #2
Sasky
AddOn Author - Click to view addons
Join Date: Apr 2014
Posts: 231
It has something to do with the string returned from GetString().
Those aren't normal periods.

Did some poking around from the /script prompt:
Lua Code:
  d(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(-1,-1))
  --> x

  d(string.byte("."))
  --> 46

  d(string.byte(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(-3,-3)))
  --> 226

  d(string.byte(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(-2,-2)))
  --> 128

  d(string.byte(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(-1,-1)))
  --> 166

Which is rather disgusting.

To replace it, you need to do:
Lua Code:
  GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):gsub(string.char(226,128,166),': ')
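
The behaviour described above can be checked outside the game with a small sketch. Here `sample` is a hypothetical stand-in for whatever GetString(SI_SMITHING_RESEARCH_IN_PROGRESS) returns in the English client, built from the byte values printed above; it is an assumption, not a call into the real API.

```lua
-- Stand-in for the game string: "Researching" followed by the three bytes
-- printed above (assumption: this mirrors the English client text).
local sample = "Researching" .. string.char(226, 128, 166)

-- string.byte can return several bytes at once; negative indices count from the end.
local b1, b2, b3 = string.byte(sample, -3, -1)
print(b1, b2, b3)

-- The original pattern looks for three ASCII periods (byte 46) and finds none.
local unchanged, hits = sample:gsub("%.%.%.", ": ")
print(unchanged == sample, hits)

-- Replacing the exact 3-byte sequence works. None of those bytes is a
-- pattern magic character, so no escaping is needed.
local replaced, count = sample:gsub(string.char(226, 128, 166), ": ")
print(replaced, count)
```

Note that gsub returns two values, the new string and the number of replacements, which matters when the result is passed straight into another function.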


Alternatively, considering that appending ": " depends on the ellipsis being at the end anyway, you could just do something like:

Lua Code:
  GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(1,-4) .. ": "

Last edited by Sasky : 03/12/15 at 11:48 PM. Reason: add proper gsub
03/12/15, 11:53 PM   #3
circonian
AddOn Author - Click to view addons
Join Date: May 2014
Posts: 613
Beats me; like Sasky said, they aren't normal periods.
You could use string.match to grab everything up to them, though, and then just add whatever you want to it.

EDIT: Merlight's right about ^[A-z]; it should have been:
Lua Code:
  string.match(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS), "^[a-zA-Z]+")..": "

  -- or:
  string.match(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS), "^%a+")..": "

But I also just realized you said "rather than translate". GetString(SI_SMITHING_RESEARCH_IN_PROGRESS) will automatically translate it, so it really sounds like you don't want to do any of this. If you don't want it translated, just do:
Lua Code:
  local researchString = "Researching: "
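
One caveat about the string.match patterns above: "^%a+" and "^[a-zA-Z]+" stop at the first byte that is not an ASCII letter, so a translation containing a space or an accented letter would be truncated. The sketch below uses made-up sample strings (they are not taken from the game's localization files) and a hypothetical helper, beforeEllipsis, that captures everything before a trailing ellipsis instead.

```lua
-- Hypothetical strings, not taken from the game's localization files.
local oneWord  = "Researching\226\128\166"
local twoWords = "Recherche en cours\226\128\166"

print(oneWord:match("^%a+") .. ": ")
print(twoWords:match("^%a+") .. ": ")  -- the space ends the match early

-- Capture everything before a trailing ellipsis, keeping spaces and
-- non-ASCII bytes; fall back to the original string if there is no ellipsis.
local function beforeEllipsis(s)
    return s:match("^(.-)\226\128\166$") or s
end
print(beforeEllipsis(twoWords) .. ": ")
```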

Last edited by circonian : 03/13/15 at 03:23 PM.
03/13/15, 06:55 AM   #4
votan
AddOn Author - Click to view addons
Join Date: Oct 2014
Posts: 577
Just to explain these strange characters:
The byte sequence 226,128,166 is the UTF-8 encoding of the Unicode character U+2026 "horizontal ellipsis".
Lua (including all string methods) treats strings as sequences of single bytes, while the client renders UTF-8 encoded strings correctly as Unicode.
This sometimes makes pattern matching a bit tricky.
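
As a rough check of the byte values above, the three-byte UTF-8 form of a code point can be computed by hand. This is a minimal sketch: utf8ThreeBytes is a hypothetical helper, it only covers code points from U+0800 to U+FFFF, and it uses plain arithmetic because Lua 5.1 has no bit operators.

```lua
-- Encode a code point in the range U+0800..U+FFFF as three UTF-8 bytes
-- (1110xxxx 10xxxxxx 10xxxxxx).
local function utf8ThreeBytes(cp)
    local b1 = 224 + math.floor(cp / 4096)       -- 0xE0 + top 4 bits
    local b2 = 128 + math.floor(cp / 64) % 64    -- 0x80 + middle 6 bits
    local b3 = 128 + cp % 64                     -- 0x80 + low 6 bits
    return b1, b2, b3
end

print(utf8ThreeBytes(0x2026))  -- 226, 128, 166

local ellipsis = string.char(utf8ThreeBytes(0x2026))
print(#ellipsis)               -- 3: the length operator counts bytes, not characters
```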

Last edited by votan : 03/13/15 at 07:15 AM.
03/13/15, 07:54 AM   #5
merlight
AddOn Author - Click to view addons
Join Date: Jul 2014
Posts: 671
Originally Posted by circonian View Post
[A-z]
Never, ever, in any language, pattern matching engine or whatever, use this character range. In many regular expression guides, this is given as an example of how the range operator in character classes should not be used. It matches ASCII letters, but also brackets '[' and ']', backslash '\\', caret '^', underscore '_' and backtick '`'.
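
A quick way to see the difference, as a sketch with arbitrary sample strings: Lua compares range endpoints by byte value, so [A-z] covers bytes 65 through 122, including the six punctuation characters that sit between 'Z' and 'a'.

```lua
-- Each sample is tested against the over-wide range and the intended one.
local samples = { "abc", "_", "[x]", "^", "`" }
for _, s in ipairs(samples) do
    local wide   = s:match("^[A-z]+$") ~= nil
    local strict = s:match("^[a-zA-Z]+$") ~= nil
    print(s, wide, strict)
end
```

Only "abc" passes both checks; the other samples pass the [A-z] check alone.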

Last edited by merlight : 03/13/15 at 07:59 AM.
03/13/15, 08:26 AM   #6
TheDepe
Join Date: May 2014
Posts: 1
Originally Posted by circonian View Post
Beats me, like Sasky said they ain't normal periods.
You use string.match to grab everything up to them though, then just add whatever you want to it.
Lua Code:
  string.match(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS), "^[A-z]+")..": "
Lua Code:
  local test1 = string.gsub(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS),'[.]+',': ')
Or this way..
03/13/15, 09:26 AM   #7
merlight
AddOn Author - Click to view addons
Join Date: Jul 2014
Posts: 671
Originally Posted by TheDepe View Post
Lua Code:
  local test1 = string.gsub(GetString(SI_SMITHING_RESEARCH_IN_PROGRESS),'[.]+',': ')
Or this way..
That replaces a sequence of periods (ASCII 46), not Unicode ellipsis, which this topic is about.

Anyway, how would you replace "foo"?
Lua Code:
  str = str:gsub("foo", "bar")

So, do the same with what you want to replace:
Lua Code:
  str = str:gsub("…", ": ") -- ellipsis, not three periods!

Or, if you're worried about having non-ASCII characters in your code:
Lua Code:
  str = str:gsub("\226\128\166", ": ")
  -- highlighting fails

NB: I will probably never stop wondering why Lua has to differ in such stupid ways, like using decimal escape sequences where everyone else uses octal.
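
For anyone converting escapes from another language, a small sketch of the difference: Lua reads up to three decimal digits after a backslash, so the byte that C writes as "\342" is "\226" in Lua. toDecimalEscapes is a hypothetical helper name, not part of any library.

```lua
-- The same byte value written in decimal, octal and hexadecimal.
print(string.format("%d  %o  %X", 226, 226, 226))

-- Three decimal escapes give the three bytes of the ellipsis.
local ellipsis = "\226\128\166"
print(#ellipsis, ellipsis == string.char(226, 128, 166))

-- Print any string as zero-padded Lua decimal escapes, so that a following
-- digit can never be mistaken for part of the escape.
local function toDecimalEscapes(s)
    return (s:gsub(".", function(c)
        return string.format("\\%03d", c:byte())
    end))
end
print(toDecimalEscapes(ellipsis))
```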
03/13/15, 01:31 PM   #8
Weolo
AddOn Author - Click to view addons
Join Date: Apr 2014
Posts: 79
Wow I never expected it to be a combination of all those things.
I appreciate you all taking a look... interesting stuff.

I will keep the Unicode replacement in mind for future things, but for this problem I will keep it simple and go with the substring to rip off the last 3 characters:
Lua Code:
  GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(1,-4) .. ": "
03/13/15, 02:16 PM   #9
merlight
AddOn Author - Click to view addons
Join Date: Jul 2014
Posts: 671
Originally Posted by Weolo View Post
I will keep the unicode replacement in mind for future things but for this problem I will keep it simple and go with the substring to rip off the last 3 characters
Lua Code:
  GetString(SI_SMITHING_RESEARCH_IN_PROGRESS):sub(1,-4) .. ": "
That's going to bite you sooner or later. Cutting UTF-8 strings at arbitrary byte positions is never a good idea.
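
To illustrate the risk, here is a sketch with made-up strings (neither is claimed to be an actual game string). sub(1,-4) removes the last three bytes whatever they are, so a string that does not end in the ellipsis can be cut in the middle of a multi-byte character. stripEllipsis is a hypothetical helper that only cuts when the suffix really is the ellipsis.

```lua
-- Hypothetical strings, written with decimal escapes so the bytes are visible.
local withEllipsis = "Researching\226\128\166"  -- ends in the 3-byte ellipsis
local noEllipsis   = "B\195\188ro"              -- "Büro": ü is 2 bytes, then "ro"

print(withEllipsis:sub(1, -4))  -- fine: the last 3 bytes are exactly one character

local cut = noEllipsis:sub(1, -4)
print(#cut, cut:byte(-1))       -- 2 bytes left, ending in 195: half of the "ü"

-- Only cut when the string really ends with the ellipsis bytes.
local ELLIPSIS = "\226\128\166"
local function stripEllipsis(s)
    if s:sub(-#ELLIPSIS) == ELLIPSIS then
        return s:sub(1, -#ELLIPSIS - 1)
    end
    return s
end
print(stripEllipsis(withEllipsis) .. ": ")
print(stripEllipsis(noEllipsis) .. ": ")
```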
03/13/15, 03:26 PM   #10
circonian
AddOn Author - Click to view addons
Join Date: May 2014
Posts: 613
Originally Posted by merlight View Post
^[A-z]
Never, ever, in any language, pattern matching engine or whatever, use this character range. In many regular expression guides, this is given as an example of how the range operator in character classes should not be used. It matches ASCII letters, but also brackets '[' and ']', backslash '\\', caret '^', underscore '_' and backtick '`'.
Eew, you're right, I corrected my post.
03/13/15, 11:24 PM   #11
Sasky
AddOn Author - Click to view addons
Join Date: Apr 2014
Posts: 231
Originally Posted by merlight View Post
That's going to bite you sooner or later. Cutting UTF-8 strings at arbitrary byte positions is never a good idea
Of course, the 'proper' solution would be to define fresh translation strings. Manipulating translated or formatted strings can bite you sooner or later.

The :gsub of the 3-byte sequence anchored to the end of string is probably the most robust:
Code:
str = str:gsub("\226\128\166$", ": ")
--When highlighting fails, kill the highlighting
It falls back to the original string if something goes wrong, unlike the :sub() cut.
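
A minimal sketch of that fallback behaviour, using plain sample strings in place of the game call. ellipsisToColon is a hypothetical helper name; the extra parentheses around the gsub call discard its second return value (the replacement count) so that only the string is returned.

```lua
local ELLIPSIS = "\226\128\166"

-- Replace a trailing ellipsis with ": "; leave every other string untouched.
local function ellipsisToColon(s)
    return (s:gsub(ELLIPSIS .. "$", ": "))
end

print(ellipsisToColon("Researching" .. ELLIPSIS))  -- trailing ellipsis replaced
print(ellipsisToColon("Researching"))              -- no ellipsis: unchanged
print(ellipsisToColon(ELLIPSIS .. " later"))       -- not at the end: unchanged
```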
03/14/15, 05:24 AM   #12
Weolo
AddOn Author - Click to view addons
Join Date: Apr 2014
Posts: 79
Fair enough, always happy to take on other people's advice.
I will use the robust solution from now on.
Thanks!