Parsing Text and Lists

August 8, 1999
by Pat McClellan

Dear Multimedia Handyman,

I have a field called "fonts" and its text is like this:

["MezzMM_375 RG", "MezzMM", "Cottonwood", "Lithos Regular", "Minion Condensed", "Minion Ornaments", "Poetica ChanceryI", "Utopia", "Viva BoldExtraExtended", "Viva Regular", ..."SurethingSymbols", "Zephyr"]

(Note: this list has been much condensed. The original list contained hundreds of font names.)

I need to convert this to an alphabetically ordered list (not a Lingo list) that looks like this:

Arial
ArialBlack
ArialNarrow
BookAntiqua
BookmanOldStyle
BookshelfSymbol1
...
Wingdings
Woodcut
Xcast
ZapfElliptBT
Zephyr

So, I need to delete the quotes, commas and brackets, replacing the commas with line breaks. Can you help me do this without completely retyping the whole list?

Kumar K.

Dear Kumar,

I picked your question for a couple of reasons. First, there's nothing more useful than understanding the concepts of parsing text. I can't tell you how many times I use parsing techniques for things such as processing external text files, validating user input, etc.

Second, your question points out the fact that good developers use Lingo to help them author. What we'll be building is an authoring utility -- which probably won't ever be used during runtime. These types of authoring tools can save you a lot of time. More importantly, they can help you avoid typos in your data entry.

Now, let's get to the explanation. We'll need to do two things here: alphabetize the list and parse out the unwanted characters. Let's start by drawing a clear distinction between a Lingo list and a string. In Lingo, a string is any sequence of characters enclosed in quote marks. That same string of characters can appear in a field cast member, but fields ONLY hold strings, so you don't need the quote marks around it.

You're used to seeing a Lingo list as a list of things within brackets. Like this...

set myList = [#cat, #dog, #bird]
put myList
-- [#cat, #dog, #bird]

In myList, I have listed three items which happen to be in symbol format (preceded by the # sign). Now, I'll put myList into the text of a field member like this (assuming that I have a field member named "animals")...

put myList into field "animals"

Now, the text of field animals will look like this:

[#cat, #dog, #bird]

But here's the tricky part: if I try to put the text of field "animals" back into myList, watch what happens:

set myList = the text of field "animals"
put myList
-- "[#cat, #dog, #bird]"

What's different? Note that this time when we put myList, the result is a string -- see the quote marks surrounding it? That means that we can't treat myList as a list anymore. I wouldn't be able to use any of the normal list functions on it. However, we can easily convert it back to a list from a string. Since the text of field "animals" is the proper syntax for a list (with the brackets and commas), all we have to do is this:

set myText = the text of field "animals"
set myList = value(myText)
put myList
-- [#cat, #dog, #bird]

Voila! The value() function evaluates the string as a Lingo expression and returns the result -- in this case, a Lingo list.
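
To see the difference for yourself, you can test the variable's type in the message window before and after the conversion. This is a minimal sketch, assuming the field "animals" from the example above still holds the bracketed text. listP() and stringP() are built-in Lingo tests that return 1 (TRUE) or 0 (FALSE).

```lingo
-- the field text comes back as a string, not a list
set myText = the text of field "animals"
put stringP(myText)
-- 1
put listP(myText)
-- 0
-- after value(), the same data is a real list
set myList = value(myText)
put listP(myList)
-- 1
```

If value() can't make sense of the string, the result won't be a list, so a listP() check is a handy safeguard before you call sort.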

The reason that I've spent this effort explaining the difference is that we need to treat this data as both a Lingo list and as a string. Why? Well, we need it to be a string for our character search, replace and delete operations. But, we need it to be a list so that we can easily alphabetize it using the list function, sort().
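
Here's a quick message window sketch of the sort step on its own, using a short made-up list rather than the full font list:

```lingo
-- a short illustrative list, deliberately out of order
set theList = ["Zephyr", "Arial", "Minion"]
sort theList
put theList
-- ["Arial", "Minion", "Zephyr"]
```

Note that sort changes the list in place, so there's no need to assign the result to a new variable.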

Let's start our code by putting the entire text of the field into a variable called "theString". Next, we'll convert theString into a list called "theList" (yeah, I'm real creative with these variable names...). Now that it's a list, we can alphabetize it by simply saying "sort theList". Now that the list is alphabetized, we need to put that back into a string format, using the string() function.


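on parseField whichField
  -- work on a copy of the field text held in a variable
  set theString = the text of field whichField
  -- convert the string to a Lingo list so it can be sorted
  set theList = value(theString)
  sort theList
  -- convert the sorted list back to a string for character parsing
  set theString = string(theList)
  set howMany = the number of chars in theString
  
  -- step backward so a deletion never shifts characters not yet visited
  repeat with i = howMany down to 1
    case char i of theString of
      ",": put RETURN into char i of theString
      RETURN, QUOTE, SPACE, "[", "]": delete char i of theString
    end case
  end repeat
  
  put theString into field whichField
  
end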

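The rest is a simple process of going through each character from the last to the first, testing to see if it's a bracket, space, quote, or comma -- then deleting or replacing it. Working backward matters: deleting a character shifts everything after it, so counting down keeps the positions we haven't visited yet unchanged. Note that I use the Lingo terms RETURN, QUOTE, and SPACE. Also note that deleting every SPACE strips the spaces inside font names too, so "Lithos Regular" becomes "LithosRegular", which matches the format in your example.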

When we've gotten through the entire string, we put theString back into the field, and we're finished.
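
If you want to keep the spaces inside names like "Lithos Regular", one alternative is to skip the character parsing and build the string item by item from the sorted list. This is a sketch of a different technique, not the handler above. The name listToLines is made up for illustration, and it assumes every item in the list is a string.

```lingo
on listToLines whichField
  -- convert the field text to a list and bail out if that fails
  set theList = value(the text of field whichField)
  if not listP(theList) then
    alert "That field doesn't hold a valid Lingo list."
    exit
  end if
  sort theList
  
  -- build the output one item per line, with no trailing RETURN
  set theString = ""
  repeat with i = 1 to count(theList)
    if i > 1 then put RETURN after theString
    put getAt(theList, i) after theString
  end repeat
  
  put theString into field whichField
end
```

Because the quotes, commas and brackets are never part of the items themselves, there's nothing to delete. The tradeoff is that the names come out exactly as they were stored, spaces included.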

Always put the text into a variable, parse the characters in the variable, then put the variable back into the field member. I don't know why, but it's significantly faster.
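
If you're curious how fast the handler runs on your own machine, you can time it with the ticks, which counts in 1/60ths of a second. A minimal sketch, with timeParse as a made-up handler name:

```lingo
on timeParse whichField
  -- note the tick count before and after the parse
  set startTicks = the ticks
  parseField whichField
  put (the ticks - startTicks) && "ticks"
end
```

Run it on a fresh copy of the field each time, since parseField replaces the bracketed list with plain lines and a second pass would no longer find a list to convert.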

To use this handler utility, put it into a movie script (not a behavior) and call it from the message window. I've created this handler so that you need to specify whichField you want it to work on. So, in your case, you'd open the message window and enter:

parseField "fonts"

You'll be amazed how fast it is! Good luck with your project.

A sample movie is available in Mac or PC format.

Patrick McClellan is Director Online's co-founder. Pat is Vice President, Managing Director for Jack Morton Worldwide, a global experiential marketing company. He is responsible for the San Francisco office, which helps major technology clients to develop marketing communications programs to reach enterprise and consumer audiences.

Copyright 1997-2024, Director Online. Article content copyright by respective authors.