Translating from XML to OpenStep and back

Chaky
Deadly
Posts: 213
Joined: Sat Aug 15, 2009 6:15 am

Post by Chaky »

This started as a reply in one of the threads, but went off topic big time.

That's why I'm starting a new thread.


My issue is this:
I'm writing a translator that will automatically convert plists from XML to ASCII and from ASCII to XML.

It is pretty much done, but there are still some details that I'm not sure about.

Mostly, the problematic files to translate are "missiontext.plist"s. I've come across ASCII plists that use real CR/LFs as part of the text. I assume that in XML those CR/LFs must be translated to "\n"s, but I have also come across XML plists that use CR/LFs (maybe by error).

Another problem is translating quotes in XML. I assume that the correct translation would be \" for each ", but I need some confirmation on that.

Also, "("s and ")"s in XML are completely undocumented. Are "\(" and "\)" the correct translation? And what about "{"s and "}"s?


There is another dilemma I'm having: whether or not to put quote marks on keys and values that contain "_"s and "-"s (negative numbers excluded). I've seen conflicting examples in current OXPs, and I don't have the time or the willpower to test-drive each translation variant.

My goal is to make translated plists platform-proof, so any rules that are platform-specific would be welcome. (I know about the Mac's sensitivity to comma-terminated arrays.)
Last edited by Chaky on Mon Oct 05, 2009 9:19 pm, edited 2 times in total.
pmw57
---- E L I T E ----
Posts: 389
Joined: Sat Sep 26, 2009 2:14 pm
Location: Christchurch, New Zealand

Post by pmw57 »

It seems that you want to convert from the XML property list format to the NeXTSTEP property list format.

The NeXTSTEP format came first; the XML format was introduced later because only it can support non-ASCII values and NSValue objects (numbers/booleans).

I applaud you for wanting to develop the back-conversion, but if you want to save space, Apple has also created a binary format in which XML property lists can be stored.

http://en.wikipedia.org/wiki/Property_list
A trumble a day keeps the doctor away, and the tax man;
even the Grim Reaper keeps his distance.
-- Paul Wilkins
Chaky
Deadly
Posts: 213
Joined: Sat Aug 15, 2009 6:15 am

Post by Chaky »

Honestly, I don't see what possible use I can have with binary-formatted plists, if they are not parsed by Oolite (and I'm on win-PC platform), but thanks for the link. There was a plist editor for windows there.
pmw57
---- E L I T E ----
Posts: 389
Joined: Sat Sep 26, 2009 2:14 pm
Location: Christchurch, New Zealand

Post by pmw57 »

Chaky wrote:
Honestly, I don't see what possible use I can have with binary-formatted plists, if they are not parsed by Oolite (and I'm on win-PC platform), but thanks for the link. There was a plist editor for windows there.
I haven't come across them myself either, but at a guess the benefit is to combine the wider variety of data types that XML supports with a much smaller file size.
It makes sense for an editor that can generate XML plists to be capable of generating binary plists as well.

A useful example is the property list editor (Windows software) from http://modmyi.com/forums/windows-specif ... ditor.html
You can open and edit .plist files and change the format to binary (while still viewing it as XML); saved that way, the OSE shipdata.plist file goes from 5841 KB down to 630 KB in size.

It seems that Oolite already supports the binary plist format as well.
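
For anyone who wants to try the size comparison without a plist editor, here is a rough sketch in Python. It relies on the standard plistlib module, which reads and writes the XML and binary formats (it does not handle OpenStep). The ship names and keys are made-up stand-ins, not real shipdata entries, so the ratio will differ from the OSE numbers above.

```python
import plistlib

# Hypothetical stand-in for a large shipdata-style dictionary.
ships = {
    f"example-ship-{i}": {
        "max_energy": 250,
        "has_ecm": True,
        "roles": "trader(0.5) pirate(0.2)",
    }
    for i in range(200)
}

# Serialize the same data in both formats.
xml_bytes = plistlib.dumps(ships, fmt=plistlib.FMT_XML)
binary_bytes = plistlib.dumps(ships, fmt=plistlib.FMT_BINARY)

# loads() detects the format by itself, so both decode to the same data.
assert plistlib.loads(xml_bytes) == ships
assert plistlib.loads(binary_bytes) == ships

print("XML bytes:   ", len(xml_bytes))
print("binary bytes:", len(binary_bytes))
```

The binary form stores each repeated key and value once and refers to it by index, which is where most of the saving comes from in data with many similar entries.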
Chaky
Deadly
Posts: 213
Joined: Sat Aug 15, 2009 6:15 am

Post by Chaky »

Well, whaddayaknow.. Oolite does parse binary format...

Now I only need to break the code...
pmw57
---- E L I T E ----
Posts: 389
Joined: Sat Sep 26, 2009 2:14 pm
Location: Christchurch, New Zealand

Post by pmw57 »

Chaky wrote:
Well, whaddayaknow.. Oolite does parse binary format...

Now I only need to break the code...
Some useful links:

http://developer.apple.com/mac/library/ ... ction.html
http://developer.apple.com/mac/library/ ... Lists.html

And how to read/write plists in OpenStep, XML or Binary using the Core Foundation
http://developer.apple.com/mac/library/ ... oundation/
Reference/CFPropertyListRef/Reference/reference.html#//apple_ref/c/func/CFPropertyListWriteToStream


Edit: Oh, and a php parser: http://www.jeremyjohnstone.com/blog/200 ... files.html - I'll stop googling on this now.

Moderator: fixed horrible long link breaking line wrapping.
JensAyton
Grand Admiral Emeritus
Posts: 6657
Joined: Sat Apr 02, 2005 2:43 pm
Location: Sweden

Re: Translating from XML to ASCII and back

Post by JensAyton »

Chaky wrote:
Mostly, the problematic files to translate are "missiontext.plist"s. I've come across ASCII plists that use real CR/LFs as part of the text. I assume that in XML those CR/LFs must be translated to "\n"s, but I have also come across XML plists that use CR/LFs (maybe by error).

Another problem is translating quotes in XML. I assume that the correct translation would be \" for each ", but I need some confirmation on that.

Also, "("s and ")"s in XML are completely undocumented. Are "\(" and "\)" the correct translation? And what about "{"s and "}"s?
Um, are you confusing XML and OpenStep format here? In XML plists (which look like this: <dict><key>foo</key><string>bar</string></dict>), the only things that need to be escaped are “<” (as “&lt;”) and “&” (as “&amp;”). In XML in general, “"” or “'” may need to be escaped in attributes (as “&quot;” and “&apos;” respectively), but these are not used in plists.

In OpenStep format property lists, the only thing I’m aware of that must be escaped is “"” (as “\"”). Other C-style escapes may be used, but are not required.
Chaky wrote:
There is another dilemma I'm having: whether or not to put quote marks on keys and values that contain "_"s and "-"s (negative numbers excluded). I've seen conflicting examples in current OXPs, and I don't have the time or the willpower to test-drive each translation variant.
Using quotes around strings (including numbers) is never wrong. Oolite safely uses “_”, “.” and “$” in keys on all platforms, and “-” in values.

Apple binary plist format is supported on all platforms, but not recommended because it’s detrimental to the openness of the Oolite community. (GNUstep binary format is, unsurprisingly, supported on non-Mac platforms.)

Incidentally, Oolite contains a complete generator for cross-platform-compatible OpenStep plists, OldSchoolPropertyListWriting.m, but of course it’s in Objective-C. (In case anyone’s curious, it’s used solely to implement the obscure --compile-sysdesc and --export-sysdesc used for localizing the system description grammar.)
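
The escaping and quoting rules from this post can be sketched in a few lines of Python. This is only an illustration, not what OldSchoolPropertyListWriting.m does: the function names are made up, and escaping the backslash on the OpenStep side is an extra assumption (the post only names the double quote), added so that backslashes already in the text are not read back as escape sequences.

```python
def escape_xml_text(text: str) -> str:
    """Escape element content for an XML plist.

    '&' is replaced first so the '&' in '&lt;' is not escaped twice.
    """
    return text.replace("&", "&amp;").replace("<", "&lt;")


def quote_openstep(text: str) -> str:
    """Return text as an always-quoted OpenStep string.

    Quoting everything follows the "never wrong" rule above.
    Escaping the backslash is an assumption of this sketch.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(escape_xml_text("fuel < 3 & cargo"))
print(quote_openstep('He said "dock" twice'))
print(quote_openstep("-5"))
```

Parentheses and braces are passed through untouched in both functions: inside XML element content and inside a quoted OpenStep string they have no special meaning.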
Chaky
Deadly
Posts: 213
Joined: Sat Aug 15, 2009 6:15 am

Post by Chaky »

Is a CR/LF (or new line, enter, return...) valid inside a string in both OpenStep (I call it ASCII, according to elitewiki...) and XML?
I've seen "\n" used instead of CR/LF in both formats, and I've seen real CR/LFs used in both formats as part of a string (texts in missiontext.plists).

My guess is that the escaped one is for OpenStep and a real CR/LF is valid in XML, but I need that confirmed.

At the moment the code translates CR/LFs inside strings in XML to \n in OpenStep and vice versa.

And what about opening and closing quotes (“”)? Do those need to be escaped too? My guess is not, but I've been wrong before...
JensAyton
Grand Admiral Emeritus
Posts: 6657
Joined: Sat Apr 02, 2005 2:43 pm
Location: Sweden

Post by JensAyton »

Chaky wrote:
Is a CR/LF (or new line, enter, return...) valid inside a string in both OpenStep (I call it ASCII, according to elitewiki...) and XML?
To the best of my knowledge, yes. (And don’t. OpenStep format is not ASCII-encoded, it’s UTF-8. You’ll note that the EliteWiki page on property lists does not mention ASCII.)
Chaky wrote:
I've seen "\n" used instead of CR/LF in both formats, and I've seen real CR/LFs used in both formats as part of a string (texts in missiontext.plists).
\n will work in OpenStep format (again, it’s optional). Literal \ns work in XML missiontext.plists, but only because we handle them explicitly. They don’t work in any other plist.
Chaky wrote:
And what about opening and closing quotes (“”)? Do those need to be escaped too? My guess is not, but I've been wrong before...
No, as long as the file is output in UTF-8. (For XML, some other encodings, such as UTF-16, should work as long as you update the <?xml?> header appropriately. However, there’s no good reason to do this.)
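
As a rough Python illustration of the two points above (the \n escape is optional in OpenStep, and curly quotes need no escaping as long as the output is UTF-8), something like the following would do. The function name and the sample text are made up for the example, and escaping the backslash is this sketch's own assumption.

```python
def openstep_text(text: str, escape_newlines: bool = True) -> str:
    """Quote text for an OpenStep plist, optionally writing line breaks as \\n."""
    # Assumption of this sketch: backslashes are escaped along with quotes.
    body = text.replace("\\", "\\\\").replace('"', '\\"')
    if escape_newlines:
        # Normalize CR/LF and bare CR to LF, then write each LF as backslash-n.
        body = body.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\\n")
    return '"' + body + '"'


# Curly quotes pass through unchanged; only the line break is rewritten.
line = openstep_text("“Welcome, Commander.”\r\nDock when ready.")
encoded = line.encode("utf-8")  # bytes to write to the .plist file

print(line)
```

Passing escape_newlines=False keeps the real line breaks in the string instead, which is the other variant discussed here.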
Chaky
Deadly
Posts: 213
Joined: Sat Aug 15, 2009 6:15 am

Post by Chaky »

Ahruman wrote:
Chaky wrote:
Is a CR/LF (or new line, enter, return...) valid inside a string in both OpenStep (I call it ASCII, according to elitewiki...) and XML?
To the best of my knowledge, yes. (And don’t. OpenStep format is not ASCII-encoded, it’s UTF-8. You’ll note that the EliteWiki page on property lists does not mention ASCII.)
It does here.

(Nevertheless, I'm calling it OpenStep from now on)
Commander McLane
---- E L I T E ----
Posts: 9520
Joined: Thu Dec 14, 2006 9:08 am
Location: a Hacker Outpost in a moderately remote area

Post by Commander McLane »

Ahruman wrote:
\n will work in OpenStep format (again, it’s optional). Literal \ns work in XML missiontext.plists, but only because we handle them explicitly. They don’t work in any other plist.
Do I understand you correctly here, that the \ns are actually not needed in missiontext.plist, and we could simply use CR/LF instead?

Or is there any good reason to stay with \ns? I'm asking, because personally I always found them awkward, and would happily drop them.
JensAyton
Grand Admiral Emeritus
Posts: 6657
Joined: Sat Apr 02, 2005 2:43 pm
Location: Sweden

Post by JensAyton »

Commander McLane wrote:
Do I understand you correctly here, that the \ns are actually not needed in missiontext.plist, and we could simply use CR/LF instead?
This will definitely work for XML, and I believe it works for OpenStep. The only advantage to using \n, as far as I’m aware, is that it doesn’t conflict with file indentation.
Eric Walch
Slightly Grand Rear Admiral
Posts: 5536
Joined: Sat Jun 16, 2007 3:48 pm
Location: Netherlands

Re: Translating from XML to OpenStep and back

Post by Eric Walch »

Since the above was written, we have focused more and more on the OpenStep version of the plist.

The interesting part is arrays. There are two different syntaxes in use here:

A) The comma is used as a separator of elements.
B) The comma is used as a terminator of elements.

The difference is in the ending of the last element. As far as I know, Windows and Linux accept both versions. On the Mac, versions after OS X Tiger also accept both syntaxes. And as Oolite 1.77 no longer supports OS X Tiger, any OXP written for Oolite 1.77 or newer no longer has to bother about the difference.

For the Mac that is important, because the official syntax in Apple's programmer documentation is to not use the comma at the end. But their own plist software released after OS X Tiger does add that terminating comma. Until now I always removed them manually, or I used the old plist editor that shipped with OS X Tiger.
I mostly used a text editor for my OpenStep plists to avoid the comma problem with arrays.
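
For anyone generating OpenStep plists from a script, syntax A (comma as separator) is easy to produce, and going by the above it is the form every platform accepts. Here is a small Python sketch; the function name and the sample entries are made up, and it only handles arrays of plain strings.

```python
def openstep_array(items: list[str], indent: str = "\t") -> str:
    """Format strings as an OpenStep array with the comma as a separator.

    No comma is written after the last element (syntax A).
    """
    if not items:
        return "()"
    # Always quote, escaping backslashes and double quotes.
    quoted = ['"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"' for s in items]
    # join() puts the comma only between elements, never after the last one.
    return "(\n" + indent + (",\n" + indent).join(quoted) + "\n)"


print(openstep_array(["first_entry", "second_entry", "third_entry"]))
```

Because the commas come from join(), there is no terminating comma to strip afterwards.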