2009-10-13

Picky XML Interpreter and a Solution for Encoding Custom Node Names

Writing and Reading XML:

I have been working a lot with Adobe's XML object in ExtendScript. It has some nice tools, but the biggest problem I've had is creating XML with custom node names.
For example, I needed to write some Object Style names, not as data, but as the names of the nodes. This is because I needed to write some data that applied to the Object Styles, so I wanted each XML node name to match the name of its Object Style.
That is fine if the name of the object style is something simple like 'Headline', but add a space as in 'Headline Frame' and suddenly there are problems. You can write the XML file because that is just text, and you can read the text file, but as soon as you ask ExtendScript to make a new XML object it will complain about the 'token'. And that makes sense because an XML node name can't contain a space. Nor can it contain < > & or %. Fine. So I wrote a function to encode those things. See entityReference.
Writing and rereading the file works, and you can verify that the file is encoded. But when the file is read, those encoded items are automatically converted back into the original characters, and Adobe's XML parser still balks at the character.

Plan E: It at least works
After several other attempts, I finally settled on the encoding used in the two functions below. They replace each special character (everything except the alphanumerics and the underscore) with a
U_ followed by the character's 4-digit hexadecimal Unicode number. For example an — ( em-dash ) is:
U_2014
I would have liked to have used the more traditional
0x followed by the Unicode number, as in:
0x2014
but that places a number ( zero ) at the start of the code. If the character needing to be encoded is the first character, then the XML node name would begin with 0 and, once again, Adobe's ExtendScript XML parser balks. So I settled on U_ even though it is nonstandard.
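As a quick illustration of the scheme described above, here is a minimal sketch in plain JavaScript. The helper name toUToken is hypothetical and does not appear in the original scripts; it only shows how a single character maps to its U_ token and why a 0x prefix would put a digit first.

```javascript
// Hypothetical helper (not part of the original post): build the U_ token
// for the first character of the passed string.
function toUToken ( ch ) {
    var hex = ch.charCodeAt ( 0 ).toString ( 16 ) ;   // e.g. '2014' or '20'
    return 'U_' + ( '0000' + hex ).slice ( -4 ) ;     // left-pad to 4 digits
}

var emDashToken = toUToken ( '\u2014' ) ;   // 'U_2014'
var spaceToken = toUToken ( ' ' ) ;         // 'U_0020'

// A name that starts with a digit is what the parser rejects, so compare
// the two candidate prefixes when the encoded character comes first.
var startsWithDigit = /^[0-9]/ ;
var zeroXStartsWithDigit = startsWithDigit.test ( '0x2014' ) ;      // true
var uTokenStartsWithDigit = startsWithDigit.test ( emDashToken ) ;  // false
```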

Built-In Encoding Options
ExtendScript contains 3 pairs of encoding / decoding functions, but all three will trigger a complaint from the XML parser.
escape (aString) <--> unescape (stringExpression)
encodeURI (text) <--> decodeURI (uri)
encodeURIComponent (text) <--> decodeURIComponent (uri)
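To see why none of the built-in pairs help, here is a small sketch, assuming ExtendScript's versions behave like the standard JavaScript functions of the same names. All three encode a space as a percent escape, and % is one of the characters that cannot appear in a node name.

```javascript
// Assumption: these behave as in standard JavaScript.
var styleName = 'Headline Frame' ;

var viaEscape = escape ( styleName ) ;                          // 'Headline%20Frame'
var viaEncodeURI = encodeURI ( styleName ) ;                    // 'Headline%20Frame'
var viaEncodeURIComponent = encodeURIComponent ( styleName ) ;  // 'Headline%20Frame'

// Every result still contains a character that is not legal in a node name.
var allContainPercent = viaEscape.indexOf ( '%' ) !== -1 &&
                        viaEncodeURI.indexOf ( '%' ) !== -1 &&
                        viaEncodeURIComponent.indexOf ( '%' ) !== -1 ;
```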

Using the epsEntitify functions:
If you need to write non-standard node names to an XML file, just run the first function on the node names before creating them and writing the file. Here is the result of one node that includes two spaces:

Original name: Quote 1 Quote
Encoded name: QuoteU_00201U_0020Quote
It is at least marginally readable if you want to read the XML file itself.

Then when you read the XML, use the second function to decode the node names. It works. The decoder returns almost immediately if there is nothing to decode.

//
function epsEntitify ( str ) {
//-------------------------------------------------------------------------
//-- E P S E N T I T I F Y
//-------------------------------------------------------------------------
//-- Generic: Yes for ExtendScript.
//-------------------------------------------------------------------------
//-- Purpose: To replace every non-word character in a passed string
//-- ( including the XML Reserved Characters ) with a custom value:
//-- a U_ prefix followed by the 4 digit hexadecimal unicode value.
//-------------------------------------------------------------------------
//-- Arguments: A string to clean up.
//-------------------------------------------------------------------------
//-- Calls: pad() to pad the hexadecimal value to 4 digits.
//-------------------------------------------------------------------------
//-- Returns: a string with all non word characters replaced with their
//-- hexadecimal value in a unicode format such as 'U_0020' for a space.
//-------------------------------------------------------------------------
//-- Sample Use:
//~ var unfitForXML = '<> close % percent' ;
//~ var safeForXML = epsEntitify ( unfitForXML ) ;
//-------------------------------------------------------------------------
//-- Notes: Uses the .toString(16) method to convert to a hexadecimal value
//-------------------------------------------------------------------------
//-- Written: 2009.10.12 by Jon S. Winters of electronic publishing support
//-- eps@electronicpublishingsupport.com
//-------------------------------------------------------------------------
    //-- Create a regular expression pattern for acceptable characters
    //-- ( \w matches letters, digits and the underscore )
    var AlphaNumeric = new RegExp ('\\w');
    //-- Create a return array ( it will be converted to a string at the end )
    var eString = new Array (str.length) ;
    //-- Loop through every character.
    for ( var si = str.length - 1 ; si >= 0 ; si-- ) {
        //-- Get a reference to the indexed character
        var activeCharacter = str.charAt ( si ) ;
        //-- If that character is included in the regular expression
        //-- pattern then add it to the return array
        if ( AlphaNumeric.test(activeCharacter) ) {
            eString[si] = activeCharacter ;
        }
        else {
            //-- It isn't an allowed character, convert it to a hexadecimal
            //-- value. Passing 16 to the built-in .toString() method
            //-- converts the character code to hexadecimal.
            eString[si] = 'U_' + pad ( str.charCodeAt ( si ).toString(16) , 4 , '0' ) ;
        }
    }
    //-- Convert the array to a string and send it back.
    return eString.join ('') ;
}
//
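The pad() helper called by epsEntitify() is not shown in this post. Below is a hypothetical stand-in, written only from how it is called above; the argument order ( value , length , padChar ) is an assumption based on that one call, and the real helper may differ.

```javascript
// Hypothetical stand-in for pad(); not the author's version.
// Assumed arguments: the value to pad, the minimum length wanted, and the
// character to add on the left until that length is reached.
function pad ( value , length , padChar ) {
    var s = String ( value ) ;
    //-- Fall back to '0' so an empty pad character can't loop forever
    if ( padChar === undefined || padChar === '' ) { padChar = '0' ; }
    while ( s.length < length ) {
        s = padChar + s ;
    }
    return s ;
}

var paddedSpace = pad ( '20' , 4 , '0' ) ;      // '0020'
var paddedEmDash = pad ( '2014' , 4 , '0' ) ;   // '2014' ( already 4 digits )
```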
//
function epsUnEntitify ( str ) {
//-------------------------------------------------------------------------
//-- E P S U N E N T I T I F Y
//-------------------------------------------------------------------------
//-- Generic: Yes, but has a very specific purpose.
//-------------------------------------------------------------------------
//-- Purpose: To take a string that has been processed with the
//-- epsEntitify() function and return it to its original values.
//-- The pair was written to encode XML files in ExtendScript
//-------------------------------------------------------------------------
//-- Arguments: str: the string to decode
//-------------------------------------------------------------------------
//-- Calls: Nothing.
//-------------------------------------------------------------------------
//-- Returns: The string decoded.
//-------------------------------------------------------------------------
//-- Written: 2009.10.13 by Jon S. Winters of electronic publishing support
//-- eps@electronicpublishingsupport.com
//-------------------------------------------------------------------------
    //-- Create the custom pattern to find the special strings used by the
    //-- epsEntitify function. Note the parentheses, which capture the
    //-- four hexadecimal digits for the .exec() method later.
    //-- The pattern is deliberately not global: a global pattern remembers
    //-- lastIndex between calls, and because the string gets shorter after
    //-- each replacement a stale lastIndex can skip over a match.
    var p = new RegExp ( 'U_([0-9a-f][0-9a-f][0-9a-f][0-9a-f])' ) ;
    var r ;
    //-- Loop through every match of that pattern. The .exec() method
    //-- returns null when there is nothing left to decode, so a string
    //-- without encoded characters is returned right away.
    while ( ( r = p.exec ( str ) ) !== null ) {
        //-- The result has at least 2 values. [0] is the matched
        //-- string ( e.g. 'U_0020' ) and [1] is the captured group
        //-- ( e.g. '0020' ).
        //-- Convert the captured group from base 16 to a number and
        //-- then create a string using that character number.
        var origChar = String.fromCharCode ( parseInt ( r [1] , 16 ) ) ;
        //-- Use a basic search / replace to replace the matched string
        //-- with the original character.
        //-- By using a regular expression and the 'g' flag this
        //-- replaces every occurrence of that match at the same time.
        str = str.replace ( new RegExp ( r [0] , 'g' ) , origChar ) ;
    }
    return str ;
}
//
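For comparison, here is a compact sketch of the same U_ scheme written with replace callbacks instead of loops. The names encodeNodeName and decodeNodeName are hypothetical, and the sketch assumes the engine accepts a function as the second argument to String.replace(), as standard JavaScript does. It is an alternative, not the functions above.

```javascript
// Hypothetical names; a sketch of the same U_xxxx scheme.
function encodeNodeName ( str ) {
    //-- Replace every non-word character with U_ plus 4 hex digits
    return str.replace ( /\W/g , function ( ch ) {
        var hex = ch.charCodeAt ( 0 ).toString ( 16 ) ;
        return 'U_' + ( '0000' + hex ).slice ( -4 ) ;
    } ) ;
}

function decodeNodeName ( str ) {
    //-- Turn each U_xxxx token back into its character in a single pass
    return str.replace ( /U_([0-9a-f]{4})/g , function ( whole , hex ) {
        return String.fromCharCode ( parseInt ( hex , 16 ) ) ;
    } ) ;
}

var originalName = 'Headline Frame <1>' ;
var encodedName = encodeNodeName ( originalName ) ;  // 'HeadlineU_0020FrameU_0020U_003c1U_003e'
var decodedName = decodeNodeName ( encodedName ) ;   // 'Headline Frame <1>'
```

Two limits apply to this scheme in either form. Digits and the underscore count as safe characters, so a name that begins with a digit still produces a node name that begins with a digit. And a name that already contains text that looks like a token ( for example a literal U_0041 ) will not survive the round trip.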

Traverse a Menu Item to Build a Folder Structure

I've been working way too much lately with user interfaces. One problem with ExtendScript and building custom menu items is that you can't make keyboard shortcuts stick. The menus tend to get built each time the script launches, and each time they get built the menu items get unique ids within the Adobe application. The problem with this is that the keyboard shortcuts are tied not to a name, but to the unique id. Thus, keyboard shortcuts won't last longer than a single launch. If you are in development mode, they won't even last that long.
The solution, it seems, is to create a stand-alone script which calls the same function as the menu. These stand-alone scripts would be visible to the user in the Scripts panel of Adobe InDesign or Adobe InCopy. Since these scripts are static ( as opposed to the ever changing unique ids of menu items ) their keyboard shortcuts stick.
The function below is called by the function that I use to build menu items. It can look at a menu item, determine its name and the names of all the submenus and menus above it, and create what amounts to a folder structure to place the script into.
Well that requires a few other functions...

//
function buildMenuFolderStructure ( aMenu ) {
//-------------------------------------------------------------------------
//-- B U I L D M E N U F O L D E R S T R U C T U R E
//-------------------------------------------------------------------------
//-- Generic: Yes, for ExtendScript
//-------------------------------------------------------------------------
//-- Purpose: To traverse a passed menu item to determine its relationship
//-- to parent menus and build a string of menus from it that can be
//-- used for a folder path. For example if passed a menu item as
//-- a submenu of a menu added to the menu bar with this type of
//-- structure: Utilities --> Apply Page Grid --> 7 column grid
//-- this function would return:
//-- Utilities/Apply Page Grid/7 column grid/
//-------------------------------------------------------------------------
//-- Arguments: aMenu -- a reference to a menu
//-------------------------------------------------------------------------
//-- Calls: itself.
//-------------------------------------------------------------------------
//-- Returns: a string described above
//-------------------------------------------------------------------------
//-- Sample Use: var r = buildMenuFolderStructure ( aMenu ) ;
//-------------------------------------------------------------------------
//-- Notes: very little error checking. The top menu is named 'Main'
//-------------------------------------------------------------------------
//-- Written: 2009.10.06 by Jon S. Winters of electronic publishing support
//-- eps@electronicpublishingsupport.com
//-------------------------------------------------------------------------
    if ( aMenu.parent.name == 'Main' ) {
        return aMenu.name + '/' ;
    }
    //-- implied else
    return buildMenuFolderStructure ( aMenu.parent ) + aMenu.name + '/' ;
}
//
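To try the recursion outside of InDesign, here is a small sketch that uses plain objects as stand-ins for menu objects. The function menuPath is a hypothetical copy of the logic above, repeated so the sketch runs by itself. The only properties assumed are name and parent, the two that the function reads.

```javascript
// Hypothetical copy of the recursion in buildMenuFolderStructure()
function menuPath ( aMenu ) {
    if ( aMenu.parent.name == 'Main' ) {
        return aMenu.name + '/' ;
    }
    return menuPath ( aMenu.parent ) + aMenu.name + '/' ;
}

// Plain objects standing in for real menu references
var mainMenu = { name : 'Main' } ;
var utilitiesMenu = { name : 'Utilities' , parent : mainMenu } ;
var applyGridMenu = { name : 'Apply Page Grid' , parent : utilitiesMenu } ;
var sevenColumnItem = { name : '7 column grid' , parent : applyGridMenu } ;

var folderPath = menuPath ( sevenColumnItem ) ;
// folderPath is 'Utilities/Apply Page Grid/7 column grid/'
```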

2009-10-12

Place Tagged Text File

While some think of copying and pasting text, I prefer to write formatted text to an Adobe InDesign Tagged Text file and then read the file back at the desired location. I won't go into the reasoning behind this, but despite what you might think, it is very fast to do.
Below is a generic function to place a tagged text file at a location (such as a selection).


//
function placeTaggedText ( location , fileRef ) {
//-------------------------------------------------------------------------
//-- P L A C E T A G G E D T E X T
//-------------------------------------------------------------------------
//-- Generic: Yes. For both InCopy and Adobe InDesign.
//-- Tested with CS3, but should work with CS2 and newer
//-------------------------------------------------------------------------
//-- Purpose: To import the passed Tagged Text at the passed location.
//-------------------------------------------------------------------------
//-- Parameters: 2
//-- location: The location to place the file. Can accept anything
//-- that InDesign and InCopy will accept for the place function
//-- and it works in the same fashion as place.
//-- fileRef: A file object to the tagged text file to import.
//-------------------------------------------------------------------------
//-- Calls: Nothing.
//-------------------------------------------------------------------------
//-- Returns: true if the function was successful, or false if it was not.
//-------------------------------------------------------------------------
//-- Sample Use: Too many options to document.
//-------------------------------------------------------------------------
//-- Notes: In reality, this function will place ANY file. It doesn't
//-- test for tagged text files. But it does hard code some Tagged
//-- text import preferences.
//-------------------------------------------------------------------------
//-- Written: 2009.07.28 by Jon S. Winters of electronic publishing support
//-- eps@electronicpublishingsupport.com
//-------------------------------------------------------------------------
//-- Revised: 2009.09.13 for version 2.60 in New Orleans, LA to allow
//-- the placement to happen even if there is a dialog displayed.
//-- The problem was that the tagged text file can have missing fonts,
//-- and that really stops the script and throws a javascript error.
    //-- Wrap the function in a try. The parameters can be set up incorrectly,
    //-- which will cause problems.
    //-- Version 2.60 2009.09.13 remember the user interaction level first so
    //-- that it can be restored however the function exits.
    var oaspuil = app.scriptPreferences.userInteractionLevel ;
    try {
        //-- Setup the Import Preferences for Tagged Text files
        app.taggedTextImportPreferences.removeTextFormatting = false ;
        app.taggedTextImportPreferences.styleConflict = StyleConflict.PUBLICATION_DEFINITION ;
        //~ app.taggedTextImportPreferences.styleConflict = StyleConflict.TAG_FILE_DEFINITION ;
        app.taggedTextImportPreferences.useTypographersQuotes = false ;
        //-- Version 2.60 2009.09.13 prevent any user interaction.
        app.scriptPreferences.userInteractionLevel = UserInteractionLevels.NEVER_INTERACT ;
        //-- Now place the file -- simple. As a Text File that is marked as a
        //-- tagged text file, it will just work.
        var placeResults = location.place ( fileRef , false ) ;
        //-- Check that the file was imported. This won't work when the wrong
        //-- location is sent.
        if ( placeResults == undefined ) { return false ; }
        //-- implied else
        return true ;
    }
    catch (err) {
        return false ;
    }
    finally {
        //-- Version 2.60 2009.09.13 return the user interaction level,
        //-- even when the place failed and the catch returned false.
        app.scriptPreferences.userInteractionLevel = oaspuil ;
    }
}
//