Text On the Other Platform

Windows, Macintosh, and Unix have three different ways of referring to a line break. Windows uses two separate characters - Carriage Return followed by Line Feed - while Macintosh uses just Carriage Return and Unix uses just Line Feed. To complicate matters, the fonts generally available on the platforms are different, and accented characters are also coded differently. While Director does its best to help you cope with all of this diversity, there are times when you need to take matters into your own hands. This article will show you how.

If you need to create a cross-platform multi-user chat application, or if you need to export and import text files that can be used on either platform, then this article is for you. You'll be discovering:

  • How much you can rely on Director's built-in font and character mapping
  • How the various line break characters appear on "the other platform"
  • How to convert files to the format used by the current platform, regardless of their origin
  • How to create files that can be used by your application on either platform.

Download a Director 8.0 cast containing a text conversion script and a wrapper script for the FileIO xtra for Windows or for Macintosh.


fontmap.txt

When you type text into a Director field or text member, Director knows the platform on which you are working. When you copy your movie or external cast to the other platform, Director realizes that the platform has changed. Director then uses its external look-up table to convert fonts and accented characters accordingly.

This look-up table is a text file called fontmap.txt. Prior to Director MX 2004, it appeared in the same folder as the Director application itself. In Director MX 2004, you will find it in the Configuration folder. In Windows, the fontmap.txt file will open in NotePad. On a Macintosh, I recommend that you use SimpleText or a third-party application like BBEdit that allows you to retain the Carriage Return line breaks.

Editing fontmap.txt on Mac OS X

On Macintosh, the chances are that the fontmap.txt file will be set to open in some other application by default. On my machine, it wants to open in CodeWarrior, for some reason. If you don't have CodeWarrior installed, it will probably want to open in Mac OS X's TextEdit.

Whatever you do, don't save any changes to it in TextEdit. TextEdit uses RTF format by default, which makes the file unreadable as far as Director is concerned. Even if you save it as raw text, TextEdit uses the Line Feed character for line breaks, which again makes it unreadable for Director. If you have administrator privileges, you can change the application used to open the fontmap.txt file by default:

  1. Select the file in the Finder
  2. Select File|Get Info
  3. Expand the Open With tab in the dialog which opens
  4. Select the application that you want to use in the popup menu

Note: you may need to create a copy of the file in order to obtain the appropriate permissions.

If you want to edit the fontmap.txt file in Mac OS X (and you will see shortly why you might), and you are unable to run Classic applications, you can import the existing file into Director as a Script, using File|Import..., then make your changes. You can then use the Link As button at the Script tab of the Property Inspector to save the edited file out to your hard disk.

Mapping Fonts

The fontmap.txt file is divided into two parts. The first part allows you to define which Windows fonts to use in place of particular Macintosh fonts, and vice versa. As the image below shows, a number of common mappings are already included. You can adjust the size of the mapped fonts if you so desire.


http://nonlinear.openspark.com/wiki/XPlatform/FontMap.gif
FontMap.txt

Mapping High ANSI Characters

Computers don't know about characters. They don't understand anything more complex than zero or one. All of the characters that you can type into your computer are coded as a series of zeroes and ones. The binary number 11111111 corresponds to the decimal number 255, so a single 8-bit byte can hold any code from 0 to 255. For languages that use the Roman alphabet, or certain variations of it, these 256 codes cover most eventualities, with the code 0 reserved as the null character rather than a printable one.

If you're working in English, you may only be concerned with the codes from 0 to 127, which cover all the numbers, punctuation, and non-accented letters. Windows and Macintosh agree on how these low codes should be used. They disagree about the accented characters and other special symbols. Indeed, they don't even agree on which symbols to include.

Try this code on both platforms:

on highANSI
  vString = ""
  repeat with i = 128 to 255
    put RETURN&i&&numToChar(i) after vString
  end repeat
  put vString
end

Here's how the results differ:


http://nonlinear.openspark.com/wiki/XPlatform/highANSI.gif
Cross-Platform high ANSI characters

The second part of the fontmap.txt file deals with converting these high ANSI characters from one platform to the other.


http://nonlinear.openspark.com/wiki/XPlatform/CharMap.gif
Character Mappings

Editing fontmap.txt

There are two main reasons why you might want to edit the fontmap.txt file: to add new fonts, or to squeeze as much compression out of a Shockwave file as possible. Generally speaking, you won't want to alter the character mapping section.

If you want to use a non-standard font on either platform (Apple Chancery on Mac OS X, for example), then you will need to add your own mappings for it. Remember that you'll need to add mappings in both directions.

Below you'll find a Director movie opened in Word. As you can see, the fontmap.txt file has been added to the movie. In fact, the copy of the fontmap.txt file alongside the Director application is added to every movie you create. If you're adamant about creating minimal DCR files, you may want to strip out all unnecessary text: references to unused fonts and unused high ANSI characters, as well as all the comments. If the movie uses only low ANSI characters and fonts that appear on both platforms, you can use an empty fontmap.txt file.


http://nonlinear.openspark.com/wiki/XPlatform/WordDir.gif
Director Movie opened in Word

It's probably best to work on a copy of the original fontmap.txt. You can use the Property Inspector at the Movie tab to import a custom version of the file into your current movie:


http://nonlinear.openspark.com/wiki/XPlatform/LoadFontMap.gif
Loading the FontMap

What fontmap.txt Cannot Do

The character mappings in the fontmap.txt file are applied only to fields and text members saved in Director casts. If you need to import or export text using the FileIO xtra, then no character mappings are applied: the text is read in exactly as it appears on the hard disk. The same is true if you need to send text from one machine to another over a multiuser connection, as in a chat application.
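To illustrate what "read in exactly as it appears on the hard disk" means, here is a minimal sketch of a raw read using the FileIO xtra. The handler name ReadRawText() is hypothetical and is not one of the handlers in the downloadable cast. It assumes the FileIO xtra is available and does only the barest error checking.

```lingo
on ReadRawText(aFilePath)
  -- Hypothetical helper, for illustration only.
  -- Returns the file contents as a string, or the FileIO
  -- status code (a non-zero integer) if the file can't be opened.
  vFile = new(xtra "fileio")
  openFile(vFile, aFilePath, 1) -- 1 = read-only
  vStatus = status(vFile)
  if vStatus <> 0 then
    return vStatus
  end if

  -- No font or character mapping is applied here: line breaks
  -- and high ANSI codes arrive exactly as they were saved.
  vText = readFile(vFile)
  closeFile(vFile)
  return vText
end ReadRawText
```

Whatever this returns still has to be checked and converted by your own code before it is shown in a field or text member.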

Suppose you have a project where you need to read in external text that is likely to include high ANSI characters. How do you know if the file is in Windows or Macintosh format? How do you know whether or not the high ANSI characters need to be converted? As long as the text contains a line break, you should be able to make an educated guess.

Line Breaks

Director started life as a Macintosh application. As a result, it uses a single Carriage Return character - numToChar(13) - for line breaks on both Macintosh and Windows. If you write the contents of a field or text member to the hard disk using FileIO, then open the resulting file in NotePad, the line breaks will look distinctly odd, as you can see here.


http://nonlinear.openspark.com/wiki/XPlatform/LineFeedWin.gif
Single Carriage Return on Windows

The solution is to replace all Director-style line breaks with Windows-style line breaks before you export. To do this, I use a generic ReplaceAll() handler. The code in the image above contains three lines that are commented out; if they were uncommented, the output would be in Macintosh format on a Macintosh and in Windows format on Windows.

on ReplaceAll(aString, aSubString, aReplacement) ---------------------
  -- INPUT: <aString> is the main string to search in
  --        <aSubString> is a string which may appear in <aString>
  --        <aReplacement> is a string which is to replace all
  --         occurrences of <aSubString> in <aString>
  -- OUTPUT: returns an updated version of <aString> where all 
  --         occurrences of <aSubString> have been replaced by
  --         <aReplacement>
  --------------------------------------------------------------------
 
  if aSubString = "" then
    return aString
  end if
 
  vTreatedString = ""
  vLengthAdjust  = the number of chars of aSubString - 1
 
  repeat while TRUE
    vOffset = offset(aSubString, aString)
    if vOffset then
      if vOffset - 1 then
        put chars(aString, 1, (vOffset - 1)) after vTreatedString
      end if
     
      put aReplacement after vTreatedString
      delete char 1 to (vOffset + vLengthAdjust) of aString
     
    else -- there are no more occurrences
      put aString after vTreatedString
      return vTreatedString
    end if
   
  end repeat
end ReplaceAll
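As a usage sketch, the handler below calls ReplaceAll() to turn Director-style line breaks into Windows-style ones before export. The name ToWindowsLineBreaks() is hypothetical. It assumes the incoming text uses either a lone Carriage Return or a Carriage Return plus Line Feed pair for line breaks, not a lone Line Feed.

```lingo
on ToWindowsLineBreaks(aString)
  -- Hypothetical wrapper around ReplaceAll(), for illustration only
  vCR = numToChar(13)
  vLF = numToChar(10)

  -- Strip any existing Line Feeds first, so that text which is
  -- already in Windows format does not end up with CR + LF + LF
  aString = ReplaceAll(aString, vLF, "")

  -- Now every line break is a lone CR: expand each one to CR + LF
  return ReplaceAll(aString, vCR, vCR & vLF)
end ToWindowsLineBreaks
```

Because ReplaceAll() moves the processed part of the string out of the way as it goes, a replacement that itself contains the search string (CR inside CR + LF) does not cause an endless loop.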

The extra Line Feed character used by Windows may appear in different guises on a Macintosh, depending on the application that displays the text. In SimpleText, it appears as rectangles at the beginning of each new line. In Director, it appears as an extra blank line in both field and text members, though it may appear as a rectangle in the Cast window thumbnails:


http://nonlinear.openspark.com/wiki/XPlatform/LFonMac.gif
numToChar(10) on Macintosh

Cross-Platform Text Files

So, if Windows-style line breaks look strange on a Macintosh, and the line breaks used by both Director and Macintosh misbehave on Windows, how can your application work cross-platform with text files?

My solution is to export the files in Windows format and to check the format on import and make the necessary conversions. Checking the format means testing what line break character is used in the imported file, and using that to decide how the high ANSI characters have been encoded:

on GetFormat(aString) ----------------------------------
  -- INPUT: <aString> must be a character string
  -- OUTPUT: Returns #win, #mac or #unix depending on
  --         which characters are used for line breaks.
  --         If no line breaks appear, Mac format is
  --         assumed.
  -----------------------------------------------------
 
  -- Define the various line break characters
  vLF = numToChar(10) -- line feed
  vCR = numToChar(13) -- carriage return
 
  -- Auto detect the string format based on line breaks
  if offset(vLF, aString) then
    -- Windows or Unix
    if offset(vCR, aString) then -- CRLF
      return #win
    else -- LF only
      return #unix
    end if
  else -- CR only
    return #mac
  end if
end GetFormat

Suppose the file is in Windows format, and the application is running on Windows. All that needs to be done is to remove the extra Line Feed characters, so that Director can display it correctly.
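For that simple case, where only the line breaks need attention, GetFormat() and ReplaceAll() can be combined along the following lines. This is a sketch rather than the code from the downloadable cast, and the name ToDirectorLineBreaks() is hypothetical.

```lingo
on ToDirectorLineBreaks(aString)
  -- Hypothetical helper, for illustration only.
  -- Returns <aString> with Director-style (CR only) line breaks.
  vLF = numToChar(10)
  vCR = numToChar(13)

  vFormat = GetFormat(aString)
  if vFormat = #win then
    return ReplaceAll(aString, vCR & vLF, vCR) -- CR + LF becomes CR
  else if vFormat = #unix then
    return ReplaceAll(aString, vLF, vCR)       -- LF becomes CR
  end if

  return aString -- #mac: already CR only
end ToDirectorLineBreaks
```

Note that a file with mixed line breaks (for example, CR + LF pairs plus a few stray Line Feeds) would be reported as #win, and the stray Line Feeds would survive this treatment.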

If the application is running on a Macintosh, the high ANSI characters need to be converted as well. This means that the ANSI code for every character needs to be checked, and characters whose codes are greater than 127 need to be converted. To do this, I create a Director list that contains the same information as the character mapping data in the fontmap.txt file. You might like to compare the first few entries in the vWinFontMap list in the handler below with those in the Character Mappings image above.

on ConvertWinStringToMac(aString) --------------------------- 
  -- INPUT: <aString> must be a string with Windows line
  --         breaks and character encoding
  -- OUTPUT: Returns a string where line breaks and characters
  --         are encoded for Macintosh
  -----------------------------------------------------------
 
  -- Create a list based on the data in FontMap.txt.  On Mac,
  -- the character numToChar(128) maps to numToChar(196) on
  -- Windows.  Item n in this list shows the Windows mapping
  -- for n + 127 on Mac.
  vWinFontMap = [ \
196, 197, 199, 201, 209, 214, 220, 225, 224, 226, 228, 227, \
229, 231, 233, 232, 234, 235, 237, 236, 238, 239, 241, 243, \
242, 244, 246, 245, 250, 249, 251, 252, 134, 176, 162, 163, \
167, 149, 182, 223, 174, 169, 153, 180, 168, 141, 198, 216, \
144, 177, 143, 142, 165, 181, 240, 221, 222, 254, 138, 170, \
186, 253, 230, 248, 191, 161, 172, 175, 131, 188, 208, 171, \
187, 133, 160, 192, 195, 213, 140, 156, 173, 151, 147, 148, \
145, 146, 247, 215, 255, 159, 158, 164, 139, 155, 128, 129, \
135, 183, 130, 132, 137, 194, 202, 193, 203, 200, 205, 206, \
207, 204, 211, 212, 157, 210, 218, 219, 217, 166, 136, 152, \
150, 154, 178, 190, 184, 189, 179, 185]
 
  vTemp   = aString
  aString = ""
 
  vCount = the number of chars in vTemp
  repeat with i = 1 to vCount
    vChar = char i of vTemp
   
    vCharNum = charToNum(vChar)
    if vCharNum = 10 then
      vChar = "" -- remove Line Feed
     
    else if vCharNum > 127 then -- convert to Mac code
      vAdjust = vWinFontMap.getPos(vCharNum)
      vChar   = numToChar(127 + vAdjust)
    end if
   
    put vChar after aString
  end repeat
 
  return aString
end ConvertWinStringToMac

I use a similar conversion process to convert from Macintosh format to Windows format before I save a file to disk on a Macintosh. This means that the file is converted twice on a Macintosh (once on import and once on export), while all that happens in Windows is that Line Feed characters are added or deleted.
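The reverse direction might look something like the sketch below. It is not the handler from the downloadable cast. The name ConvertMacStringToWinSketch() and the aWinFontMap parameter are hypothetical, and the handler assumes that you pass in the same 128-item list that appears as vWinFontMap in ConvertWinStringToMac(), and that the incoming string uses Director-style (CR only) line breaks.

```lingo
on ConvertMacStringToWinSketch(aString, aWinFontMap)
  -- Hypothetical handler, for illustration only.
  -- <aWinFontMap> is assumed to be the same list as vWinFontMap
  -- above: item n holds the Windows code for Mac code n + 127.
  vResult = ""
  vCount  = the number of chars in aString

  repeat with i = 1 to vCount
    vChar    = char i of aString
    vCharNum = charToNum(vChar)

    if vCharNum = 13 then
      vChar = vChar & numToChar(10) -- CR becomes CR + LF

    else if vCharNum > 127 then -- convert to Windows code
      vChar = numToChar(aWinFontMap[vCharNum - 127])
    end if

    put vChar after vResult
  end repeat

  return vResult
end ConvertMacStringToWinSketch
```

Going from Mac to Windows is a direct look-up by position, whereas ConvertWinStringToMac() has to search the list with getPos() to find the position of a given Windows code.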

In fact, the complete handlers that you'll find in the downloadable cast can handle conversion from Mac, Unix, or Windows formats to either Windows or Mac. The cast also includes a script containing much more robust versions of the WriteToFile() and ReadFromFile() handlers mentioned previously. These are called FileWrite() and FileRead(). Here are handlers that you could use for importing and exporting text cleanly on both platforms.

on Export(aString) ---------------------------------------------
  -- INPUT: <aString> must be a character string
  -- ACTION: Writes the string to disk using Windows format
  -- OUTPUT: Return 0 for no error, or an error symbol
  --------------------------------------------------------------
 
  aString = ConvertToWin(aString) -- Carriage Return + Line Feed
  vResult = FileWrite(aString)
 
  return vResult
end Export

on Import(aFilePath) -------------------------------------------
  -- INPUT: <aFilePath> must be an absolute file path string
  -- ACTION: Reads in text from the given file and converts it
  --         to the Director format for the current platform
  -- OUTPUT: Return a string, or an error symbol
  --------------------------------------------------------------
 
  vString = FileRead(aFilePath)
 
  if stringP(vString) then
    if the platform starts "Mac" then
      vString = ConvertToMac(vString)
    else
      vString = ConvertToWin(vString, TRUE) -- use CR only
    end if
  end if  
 
  return vString
end Import

Conclusion

Director takes care of most of the platform differences between Windows and Macintosh, as far as fonts and text encoding are concerned. If you need to import text on the fly, you can mimic the techniques used by Director to convert to the appropriate format, as required.