Problem with encode(), forms and umlauten

Odonel
Cave Troll
Posts: 26
Joined: Wed Dec 08, 2010 5:50 am
Location: Germany

Post by Odonel »

I don't understand what is happening, and I hope I can describe my problem well enough for you to follow :D

The macro: A dialog presents the user with the contents of a string list. Each entry is paired with a checkbox, and the user chooses which entries should appear on the character sheet. The dialog submits the form data, and a second macro (actually the same macro called with different parameters) combines the chosen string list entries into one string, which is encoded via encode() and stored in another property that is finally displayed on the character sheet.

The problem: I use Ubuntu 9.04 as my development system, and there everything works. But if I run the same macro under Windows 7, I get little squares instead of my umlauts (ä, ö, ü...). If I write [r: encode("ä")] I get an error message, as expected: MapTool always gives me errors when I write umlauts directly in my macros. Normally I use HTML entities instead. Unfortunately, the HTML entities are already translated into characters when the string list entries are displayed in the form. Hence the form submits each umlaut as a single character, which MapTool apparently can't handle. This doesn't kill my macro, though: when the umlauts are inside a JSON object, they are just replaced by the little squares.

What I don't understand is why my Ubuntu environment doesn't have this problem. It even encodes the umlauts correctly, although writing an umlaut directly in a macro gives me the same error message as under Windows.

My question: I would greatly appreciate any advice on how to work around the umlauts and get my macro to work as intended. As a bonus, I'm eager to know why Ubuntu doesn't have this problem at all.

[spoiler=My code]String list property example

name: Vorteile
content: Glück, Gutaussehend

The macro in question (translated :>)

<!-- Checks whether the select dialog should be shown or evaluated. -->
[if(json.type(macro.args)!="OBJECT"), code:
{
   <!-- If the macro is called from the character sheet, macro.args would be e.g. ["Vorteile"] -->
   [h: ChoosenProperty = macro.args]

   [dialog("selectPropertyContent", "width=250; height=500; temporary=1; input=1;"): {
      <html>
            <head>
               <title>Select</title>
            </head>
            <body>
               <form name="propertyForm" action="macro://[email protected]:macros/all/Impersonated?" method="json">
                  [h: Propertycontent = getProperty(ChoosenProperty)]
            
                  <p>
                     [count(listCount(Propertycontent), ""):
                        '<input type="checkbox" name="'+roll.count+'" value="'+listGet(Propertycontent, roll.count)+'" checked="checked">'+listGet(Propertycontent, roll.count)+'<br>'
                     ]
                  </p>
            
                  <input type="submit" value="Change" name="[r: ChoosenProperty]">
               </form>
            </body>
      </html>
   }]
};
{
   [h: mParameter = json.toStrProp(macro.args)]
   [h: mNumberOfParameters = countStrProp(mParameter)]
   [h: mChoosenProperty = indexKeyStrProp(mParameter, mNumberOfParameters-1)]

   [h: output = ""]
   [h, count(mNumberOfParameters-1): output = listAppend(output, indexValueStrProp(mParameter, roll.count))]
   [h: output = encode(output)]
   [h: setProperty(mChoosenProperty+"_Displaycache", output, currentToken())]
}]
[/spoiler]
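
The second branch of the macro can be sketched outside MapTool. The Python snippet below is only an illustration under assumptions: `build_display_cache` is a hypothetical helper, the submitted form data is modeled as a dict whose last key is the submit button (the property name), and encode() is assumed to behave roughly like UTF-8 percent-encoding, which nothing in this thread confirms.

```python
from urllib.parse import quote, unquote

def build_display_cache(form_data):
    """Mimic the macro's second branch (hypothetical helper, not MapTool code)."""
    keys = list(form_data)
    # The submit button is the last form field; its name is the property name.
    property_name = keys[-1]
    # All earlier fields are the checked entries.
    chosen = [form_data[key] for key in keys[:-1]]
    joined = ", ".join(chosen)
    # Assumption: encode() is approximated here by UTF-8 percent-encoding.
    return property_name + "_Displaycache", quote(joined, safe="")

# "Gl\u00fcck" is "Glück", written with an escape to keep the source ASCII-only.
form_data = {"0": "Gl\u00fcck", "1": "Gutaussehend", "Vorteile": "Change"}
cache_name, cache_value = build_display_cache(form_data)
print(cache_name, cache_value)
```

Under that assumption the stored value contains only ASCII characters, so any damage to the umlauts would have to happen before or during the encoding step rather than afterwards.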

Azhrei
Site Admin
Posts: 12058
Joined: Mon Jun 12, 2006 1:20 pm
Location: Tampa, FL

Re: Problem with encode(), forms and umlauten

Post by Azhrei »

Which version of MapTool are you using, and which version of Java on each of your platforms?

(I don't expect the Java version to be an issue, but it's still useful to have.)

Umlaut characters (and similar) cannot be read by the parser (i.e. they can't be entered into the chat window or into a macro command window such that they will work). But other input fields (such as text fields in an input or button labels for macros) will work fine because they are not processed by the parser -- they are handled directly by Java without any additional processing.

An umlaut inside a token property should work fine IF you're running a recent enough version of MT that writes out the campaign and token files using UTF-8 instead of the user's default character encoding. Ubuntu likely defaults to UTF-8 (i.e. that IS the default char encoding), but Windows probably doesn't. But it's impossible to continue probing the problem without knowing which version of MT you're using...
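
The effect of a mismatched default encoding can be reproduced outside MapTool. This Python sketch is an illustration only: `roundtrip` is a hypothetical helper, and cp1252 stands in for a typical Western European Windows default, which is an assumption about the machine in question.

```python
def roundtrip(text, write_encoding, read_encoding):
    """Encode text with one charset and decode the bytes with another."""
    raw = text.encode(write_encoding)
    # errors="replace" substitutes U+FFFD for bytes the reader cannot decode.
    return raw.decode(read_encoding, errors="replace")

word = "Gl\u00fcck"  # "Glück", written with an escape to keep the source ASCII-only

same = roundtrip(word, "utf-8", "utf-8")        # writer and reader agree
as_cp1252 = roundtrip(word, "utf-8", "cp1252")  # UTF-8 bytes read as cp1252
as_utf8 = roundtrip(word, "cp1252", "utf-8")    # cp1252 bytes read as UTF-8

print(same, as_cp1252, as_utf8)
```

In the last case the umlaut becomes the replacement character U+FFFD, which many fonts draw as a small square or question mark, similar to the symptom described above.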

Re: Problem with encode(), forms and umlauten

Post by Odonel »

Azhrei wrote:Which version of MapTool are you using, and which version of Java on each of your platforms?

Ubuntu is running Java 1.6.0_24 and Windows is running version 1.6.0_26.
I used MapTool 1.3.b80 for development, but after you suggested the MapTool version could be the problem I tried 1.3.b86. Unfortunately the problem remains.

Re: Problem with encode(), forms and umlauten

Post by Odonel »

After half a year I have come back to programming macros for MapTool. My problem with the umlauts still exists. How can I tell MapTool 1.3.b86 on Windows that it should use UTF-8?
(I'm using Java 1.6.0_30-b12 now.)

aliasmask
Deity
Posts: 8624
Joined: Tue Nov 10, 2009 6:11 pm
Location: Bay Area

Re: Problem with encode(), forms and umlauten

Post by aliasmask »

If I recall correctly, at least on my Windows XP system, you cannot pass Unicode characters in a form. I can see how that could be bad for language support and custom forms.

Re: Problem with encode(), forms and umlauten

Post by Azhrei »

The parser will not accept non-ASCII characters.

The workaround is to use HTML entities instead of Unicode.

(This limitation is for the macro parser. It may be possible to use Unicode in places where the parser is not involved, such as table data or campaign properties.)
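
As an illustration of the entity workaround, here is a small Python sketch (not MapTool code; `to_entities` is a hypothetical helper). It converts non-ASCII characters to numeric HTML entities and then shows what an HTML renderer does with them.

```python
import html

def to_entities(text):
    """Replace every non-ASCII character with a numeric HTML entity."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# "Gl\u00fcck" is "Glück", written with an escape to keep the source ASCII-only.
stored = to_entities("Gl\u00fcck")  # ASCII-only form that a parser limited to ASCII can read
rendered = html.unescape(stored)    # what an HTML renderer turns it back into
named = html.unescape("Gl&uuml;ck") # named entities decode the same way

print(stored, rendered, named)
```

The second step is the catch: once the text has been rendered in an HTML form, the form submits the decoded character, not the entity.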

patnodewf
Cave Troll
Posts: 97
Joined: Sun Jan 15, 2012 3:44 am

Re: Problem with encode(), forms and umlauten

Post by patnodewf »

For the umlauts, could you just follow the base character with an "e" to get around it (ä becomes "ae", etc.)? Phonetically, they should be similar at least.
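
A minimal sketch of that substitution in Python (illustrative only, not MapTool macro code; `transliterate` and `UMLAUT_MAP` are hypothetical names, and the sharp s is an addition not mentioned in this thread):

```python
# Mapping from German umlauts to their conventional ASCII digraphs.
# Escapes are used so the source itself stays ASCII-only.
UMLAUT_MAP = str.maketrans({
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue",  # lowercase a, o, u with umlaut
    "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue",  # uppercase A, O, U with umlaut
    "\u00df": "ss",  # sharp s, added here as an assumption
})

def transliterate(text):
    """Return text with umlauts replaced by their two-letter ASCII forms."""
    return text.translate(UMLAUT_MAP)

result = transliterate("Gl\u00fcck, Gutaussehend")
print(result)
```

The output is pure ASCII, so it avoids the encoding question entirely, at the cost of changing the spelling.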

Re: Problem with encode(), forms and umlauten

Post by Odonel »

patnodewf wrote:For the umlaute, could you just follow up the normal character with an "e" to get around it? (ä becomes "ae" etc.) phonetically, they should be similar at least.

That will be my workaround. People can understand it, but it just does not look nice. Since it works fine on Linux, I hoped I could somehow tell Windows to use UTF-8 instead of ASCII.

Azhrei wrote:The parser will not accept non-ASCII characters.
The workaround is to use HTML entities instead of Unicode.

(This limitation is for the macro parser. It may be possible to use Unicode in places where the parser is not involved, such as table data or campaign properties.)

The parser does not seem to be involved; otherwise it would not work on Ubuntu, would it? I tried to use HTML entities, but when they are read from the token property and written into the form, they are translated into umlauts. The content of the form is then used elsewhere, and since the HTML entities were already translated, the umlauts themselves are passed on and are not understood by the following processes.

Re: Problem with encode(), forms and umlauten

Post by Azhrei »

Hm, then there must be something else going on. The MTscript parser is the one with the limitation. The parser is used for anything typed into the chat window and anything executed as a macro or via [wfunc]eval[/wfunc]. That means input fields (such as the results of a form submission or data typed into an [wfunc]input[/wfunc] field or the label on a button) should be okay.

If you're finding that a form submission isn't working, maybe you're trying to [wfunc]eval[/wfunc] it?

jfrazierjr
Deity
Posts: 5176
Joined: Tue Sep 11, 2007 7:31 pm

Re: Problem with encode(), forms and umlauten

Post by jfrazierjr »

[spoiler=Technical stuff here]Ummm.... If it works in Ubuntu but not Windows, then the issue is probably Microsoft's default code page (encoding) being windows-1252 (for US systems anyway, and perhaps only up to a certain version of Windows; Windows 7 may use UTF-8 natively, I don't know and don't care. ALSO NOTE: the windows-1252 code page/encoding IS NOT the same as iso-8859-1 on US English versions of Windows!). This is so that there is some "native" support for the STUPID MS Word "helpful" characters such as left quote, right quote, ellipsis, etc. without having to do everything in Unicode.

In THEORY, you could try adding:
-Dfile.encoding="UTF-8"

to the batch file prior to the memory/stack arguments.


This may or may not work... I have seen various arguments on the interwebs, some saying it certainly works and others saying "don't do that, darnit".... I have not looked, but I guess it also depends on how MapTool and ANTLR 2.7 work. I have no idea, but I suspect that ANTLR internally opens a character stream from the string MapTool passes in, and more than likely ANTLR just uses the computer's default encoding by querying this system property or something else (hopefully not hard-coded).[/spoiler]

Re: Problem with encode(), forms and umlauten

Post by Odonel »

jfrazierjr wrote:In THEORY, you could try adding:
-Dfile.encoding="UTF-8"

to the batch file prior to the memory/stack arguments.

This may or may not work... I have seen various arguments on the interwebs, some saying it certainly works and others saying "don't do that, darnit".... I have not looked, but I guess it also depends on how MapTool and ANTLR 2.7 work. I have no idea, but I suspect that ANTLR internally opens a character stream from the string MapTool passes in, and more than likely ANTLR just uses the computer's default encoding by querying this system property or something else (hopefully not hard-coded).


It worked! Thank you very much! Thanks to everyone else for their efforts, too!

I will look around the internet for reasons why one should not use this method, and then decide whether the drawbacks are big enough that I follow the characters with an "e" after all.

Re: Problem with encode(), forms and umlauten

Post by aliasmask »

Any way we can add this to the MapToolLauncher process? It would be nice if this were a default setting. I don't know what drawbacks there would be, if any.

Re: Problem with encode(), forms and umlauten

Post by Azhrei »

It might actually be nice for the launcher to include some user-friendly "helpful" fields. Things like checkboxes for various -D options, a checkbox for java.exe vs. javaw.exe, and so forth.

But it's written in C (on Windows! Ugh!) so I'm certainly not touching it. Should anyone else be willing to take a stab at it, the source is available on SourceForge, of course.

patoace
Dragon
Posts: 313
Joined: Mon Sep 24, 2007 6:10 pm
Location: Rancagua - Chile

Re: Problem with encode(), forms and umlauten

Post by patoace »

Wow!! That's fantastic. I see a new version of my framework coming.

Also, on Windows the encoding can be set with an environment variable:

JAVA_TOOL_OPTIONS = -Dfile.encoding=UTF8

Re: Problem with encode(), forms and umlauten

Post by patoace »

:D

I kept investigating this and found out that you can also set this in the mt.cfg file by adding the line

JVM=javaw -Dfile.encoding="UTF-8"


Maybe this could be set by default.
