Converting characters to bytes...


Azhrei
Site Admin
Posts: 12086
Joined: Mon Jun 12, 2006 1:20 pm
Location: Tampa, FL

Post by Azhrei »

I've been working on the locale support in MapTool for loading/saving data. I'm confusing myself a bit when it comes to the zip file output, so this is my stream-of-consciousness rambling about what data I have and what I'm trying to produce. Maybe it'll help me clarify what I need to do, or maybe it'll help someone else in the future.

I have an InputStreamReader (which I treat as a Reader) which is connected to a text file (via FileInputStream) and the UTF-8 character set (passed to the InputStreamReader constructor). So any time I read from the Reader, the text will be interpreted as UTF-8 and life is good. :)

Now I want to copy that data stream to a java.util.zip.ZipEntry inside the campaign file. I do this by creating a ZipOutputStream and connecting it to a FileOutputStream. But how should I copy the data?

If I read() from the Reader I can store data into a char[]. But I can only write a byte[] to the OutputStream. What is the proper way to convert characters into bytes? Ideally I could just grab the bottom 8 bits, but that would screw up Unicode characters stored in the char[]. So should I be using a UTF-8 encoder as the data is written to the OutputStream? That seems counter-intuitive since I'm trying to write "raw" data. Although the rule is that characters must be decoded when reading and encoded when writing; but should that apply to the ZipOutputStream as well?

Now that I've typed all that out (!) I think I should be encoding on the way out again... I'll continue from there. :)

whited
Cave Troll
Posts: 45
Joined: Sun Aug 29, 2010 8:58 pm
Location: Seattle, WA

Re: Converting characters to bytes...

Post by whited »

Sorry if you've already figured this stuff out - I realize it's been a month since this post was written...

It sounds like you are looking for OutputStreamWriter to write characters and get them encoded properly on the wrapped OutputStream.
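As a rough sketch of that suggestion (not MapTool's actual code), the Reader can be drained into a zip entry through an OutputStreamWriter that does the UTF-8 encoding. The class name `ZipTextCopy`, the method `copyTextEntry`, and the entry name passed in are hypothetical, and `StandardCharsets` assumes Java 7 or later; on older JVMs `Charset.forName("UTF-8")` would stand in.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
class ZipTextCopy {
    /** Copies every character from reader into a new zip entry, encoded as UTF-8. */
    static void copyTextEntry(Reader reader, ZipOutputStream zip, String entryName)
            throws IOException {
        zip.putNextEntry(new ZipEntry(entryName));
        // The writer encodes chars to bytes on the way into the zip stream.
        Writer writer = new OutputStreamWriter(zip, StandardCharsets.UTF_8);
        char[] buffer = new char[4096];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, count);
        }
        // Flush rather than close: closing the writer would also close the zip stream.
        writer.flush();
        zip.closeEntry();
    }
}
```

The important detail is flushing the writer before `closeEntry()`, so that any bytes still buffered in the encoder land in the right entry.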

Some other random UTF-8 encoding things would be:

If you have a string you need to get UTF-8 bytes for, you can use String.getBytes("UTF-8").
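A small sketch of why encoding matters here, compared with the "grab the bottom 8 bits" idea from the original question. The class and method names are made up for the example. It uses the `getBytes(Charset)` overload with `StandardCharsets` (Java 7+) instead of `getBytes("UTF-8")`, which behaves the same but declares a checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class Utf8BytesDemo {
    /** Encodes the text as UTF-8; non-ASCII characters become multi-byte sequences. */
    static byte[] utf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Keeps only the low 8 bits of each char; lossy for anything above U+00FF. */
    static byte[] lowEightBits(String text) {
        byte[] out = new byte[text.length()];
        for (int i = 0; i < text.length(); i++) {
            out[i] = (byte) text.charAt(i);
        }
        return out;
    }
}
```

For plain ASCII the two methods return identical bytes, which is why the truncation shortcut can appear to work until the first non-Latin character shows up. The euro sign (U+20AC), for instance, encodes to three UTF-8 bytes but truncates to the single byte 0xAC.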

I see the MT project is using Jakarta Commons. The Jakarta Commons Codec library has StringUtils.getBytesUtf8(String), which appears to be a convenience method for the previous call.

JakartaCommons IO has a FileWriterWithEncoding class, but that writes directly to a file, so you can't wrap your ZipOutputStream with it.

Hope that helps!

Azhrei
Site Admin
Posts: 12086
Joined: Mon Jun 12, 2006 1:20 pm
Location: Tampa, FL

Re: Converting characters to bytes...

Post by Azhrei »

Yep, I am familiar with the idea of retrieving bytes in a particular encoding when starting from a String. The issue was more about storing data into a ZIP file using ZipOutputStream. When I'm writing text, it should be converted, and when I'm writing binary data, it shouldn't. But internally MT never considered the need for localization, so the code used to read/write the ZIP files always used {Input|Output}Stream and never {Reader|Writer}.
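To make the text-versus-binary distinction concrete, here is a minimal sketch of one ZipOutputStream receiving both kinds of entry. This is an assumption about the general shape of such code, not the actual b74 change; `MixedZipWriter`, `putText`, and `putBinary` are hypothetical names, and `StandardCharsets` assumes Java 7 or later.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
class MixedZipWriter {
    /** Text entry: characters go through a Writer so they are encoded as UTF-8. */
    static void putText(ZipOutputStream zip, String name, String text) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        Writer writer = new OutputStreamWriter(zip, StandardCharsets.UTF_8);
        writer.write(text);
        writer.flush(); // push encoded bytes into this entry without closing the zip
        zip.closeEntry();
    }

    /** Binary entry: bytes are written to the stream untouched, no charset involved. */
    static void putBinary(ZipOutputStream zip, String name, byte[] data) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(data);
        zip.closeEntry();
    }
}
```

Only the text path ever involves a charset; image or other asset data takes the binary path and is stored byte for byte.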

In any case, that's all been fixed now and b74 will work properly. :)

I had hoped to hear from another forum member about a patch of his, but it doesn't seem like it's coming, so b74 will either be out tonight or late Wednesday night (I've got a game tomorrow night so I'm busy 8)).
