Bad token names - advanced topic

wolph42 · Post by **wolph42** » Wed Mar 30, 2016 5:44 am

For the more experienced macro writers:

I would like to sort 'bad characters in token names' into three groups: really bad, reasonably bad, and possibly bad. 'Bad' in the sense that if such a name is used in a macro, it can screw things up.

Reason

I already have a diagnostic macro that checks for these things. I'm currently building a general 'fix token names' macro to substitute all of these characters, but I want to give the user some sense of what really needs fixing and what doesn't.

List of characters I check for:

Code:

&+:;=?@#|'<>.-, trailingSpace leadingSpace

Macro code

Code:

<!-- check the tokens for any illegal character -->
[h:tokList    = getTokenNames("json", '{layer:["TOKEN","GM","OBJECT","BACKGROUND"]}')]
<!-- in another routine the tokens with a , in the name are removed first -->
[h:hasWrongTokName ="[]"]
[h:regResult    = strfind(tokList,"[^,]*([&+:;=?@#|'<>.-]+)[^,]*")]
[h:numWrong        = min(1000,getFindCount(regResult))]
[h,if(numWrong), CODE:{
    [count(numWrong): hasWrongTokName = json.append(hasWrongTokName, getGroup(regResult, roll.count+1,0))]
}]
[r:json.toList(hasWrongTokName)]

Spaces are checked separately.

I've tried adding:

Code:

$\/^~

but these all screw up the regex check.

Three questions:
1. Do you know how I can add one or more of the 'tried to add but failed' characters to the list?
2. Did I miss any (important) characters?
3. How would you divide the characters over the three groups?

My take:

Really
, : '

Reasonably
& ? @ ; trailing/leading space

Possibly
+ = # | < > . -

metatheurgist · Post by **metatheurgist** » Wed Mar 30, 2016 6:16 am

What do the 3 groups mean?

Post by **aliasmask** » Wed Mar 30, 2016 6:59 am

Unicode characters could possibly screw things up too. You can escape those special characters, except for $. It needs an MT escape and a regex escape. I have no idea how many \'s you need before it. I'm thinking 3. I would also put it inside []'s.

wolph42 · Post by **wolph42** » Wed Mar 30, 2016 7:25 am

metatheurgist wrote: What do the 3 groups mean?

If you put a "," in your token name, e.g. "Troll, Giant", and then use e.g. [getTokenNames(",")], the name is split into two list entries and you get a heap of bugs in the code. Hence 'really bad', as that function is used quite often.
Overall the three groups are a bit arbitrary (experienced-macro-writer gut feeling) and I want a somewhat broader opinion, hence the question.
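
The comma problem described above can be illustrated outside MapTool. The following is a minimal Python sketch, not macro code: it assumes only that a comma-delimited string list behaves like a joined string that is later split on the same delimiter. The token names are made up for the example.

```python
# Python stand-in for a comma-delimited string list.
# The token names are hypothetical examples.
token_names = ["Troll, Giant", "Orc", "Elf"]

# Build the list with "," as the delimiter...
string_list = ",".join(token_names)

# ...and split it again: the round trip does not give back the original names.
round_trip = [name.strip() for name in string_list.split(",")]

print(len(token_names))  # 3
print(round_trip)        # ['Troll', 'Giant', 'Orc', 'Elf']
```

Three tokens go in and four names come out, none of which is "Troll, Giant", so any later lookup by name fails for that token.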

aliasmask wrote: Unicode characters could possibly screw things up too. You can escape those special characters, except for $. It needs an MT escape and a regex escape. I have no idea how many \'s you need before it. I'm thinking 3. I would also put it inside []'s.

Might I suggest that you copy-paste my code from the OP and test it with your own suggestion? I already did that before posting here, and there's no way I can get that code running with any of those characters in there (escaped, double escaped, etc.).
IIRC unicode chars also screw up the macro itself (so it won't run), but I can always try. Any typical suggestions?
Edit: I found out that ANYTHING placed AFTER the dash (-) in the above regex string produces an error, which is why most of what I tried simply didn't work. I'll try again. Note that \\\$ works (when NOT placed after the dash).
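
The dash behaviour noted in the edit above is standard for regex character classes: a dash between two characters defines a range, and only a dash at the very end (or start) of the class is taken literally. The sketch below uses Python's `re` module as a stand-in and assumes MapTool's regex engine treats ranges the same way; it covers only the regex level, not the extra MapTool string escaping discussed in this thread. The token names are made up.

```python
import re

# Original class from the thread: the dash is last, so it is a literal dash.
original = re.compile(r"[&+:;=?@#|'<>.-]")

# Appending "~" after the dash turns ".-~" into a range (0x2E to 0x7E),
# which covers digits and letters, so ordinary names match as well.
too_wide = re.compile(r"[&+:;=?@#|'<>.-~]")

# Appending "$" gives the range ".-$", which runs backwards (0x2E to 0x24)
# and is rejected when the pattern is compiled.
try:
    re.compile(r"[&+:;=?@#|'<>.-$]")
    reversed_range_fails = False
except re.error:
    reversed_range_fails = True

# Keeping the dash last lets the extra characters be added as literals.
extended = re.compile(r"[&+:;=?@#|'<>.$\\/^~-]")


def flagged(pattern, names):
    """Return the names that contain at least one character from the class."""
    return [n for n in names if pattern.search(n)]


names = ["Orc", "Troll-2", "Gold$", "a\\b", "x~y"]  # hypothetical token names
print(flagged(original, names))
print(flagged(too_wide, names))
print(flagged(extended, names))
print(reversed_range_fails)
```

With the dash kept as the last character, `$`, `\`, `/`, `^` and `~` can all sit in the class without forming a range.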

Useful website: http://blog.codinghorror.com/ascii-pron ... ogrammers/

Edit: as for unicode, using the grave and acute accents in the regex does not make the macro fail, but the regex no longer works (it returns all tokens):

Code:

[h:regResult    = strfind(tokListC,"[^,]*([&+:;=?@#|'<>.-  ̀ ́ ]+)[^,]*")]
<!-- I had to add some spaces, or the web browser wouldn't accept them either. -->
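
One plausible explanation for "returns all tokens" is again the dash: if the accents directly follow it, the class contains a range from "." up to the accent's code point, which includes every ASCII letter and digit. The Python sketch below shows this under the assumption that the characters were the combining grave (U+0300) and combining acute (U+0301) and that MapTool's regex engine handles ranges like Python's `re`; the token names are made up.

```python
import re

GRAVE = "\u0300"  # combining grave accent (assumed to be the character used)
ACUTE = "\u0301"  # combining acute accent (assumed to be the character used)

# Accents after the dash: ".-\u0300" is a range from 0x2E to 0x300.
after_dash = re.compile("[&+:;=?@#|'<>.-" + GRAVE + ACUTE + "]")

# Accents before the dash: the dash is last and stays a literal.
before_dash = re.compile("[&+:;=?@#|'<>." + GRAVE + ACUTE + "-]")

plain = "Orc"                # hypothetical clean token name
accented = "Ore" + GRAVE     # hypothetical name containing a combining accent

print(bool(after_dash.search(plain)))      # True: every letter is in the range
print(bool(before_dash.search(plain)))     # False
print(bool(before_dash.search(accented)))  # True
```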

wolph42 · Post by **wolph42** » Thu Mar 31, 2016 3:27 am

Bump. Does anyone have a take on either the missed characters or the grouping of the characters as I've suggested?
