Censoring User Input

May 21, 2000
by Pat McClellan

I have an editable text field so the user can type in his or her name in order to enter the game. I was wondering if there's any way I can keep the user from entering the game when the name contains inappropriate words from a list I made?

Pei-Chen Chen

Dear Pei-Chen,

There are two aspects to your question. The first is whether it is technically possible to compare the user's text input to a list, and prevent the program from accepting inappropriate words as defined in that list. The answer to that is yes, and it's fairly simple. The second aspect of your challenge is the hardest part: determining an algorithm which adequately screens the undesirable words -- while accepting a wide and undefinable range of names.

I recently completed a program for a well-known family entertainment company which asked me to do exactly what you're asking. After much consideration, this is the approach we decided to take.

  1. User enters text and presses the "finished" button.
  2. Each word of the user entry is checked to see if it is contained in the list of predefined, inappropriate words (held in a field).
  3. Each word in the list of inappropriate words is checked to see if it is contained in the user entry.
  4. If no match is found in either of the two checks above, the text is processed. Otherwise, the text is cleared and an alert message appears.

Here's a demo which operates on this algorithm.

D7 download for Mac or Windows.

Let's look at the two steps of language checking a little deeper. In order to make this article a bit less offensive, I'm going to use sports words as my "inappropriate" vocabulary. I'll assume that the list of unacceptable words is as follows:

wordList = ["baseball", "soccer", "tennis", "badminton", "polo"]

If the user enters "baseball", then the program will detect that the user entry is contained in the wordList and it will not be accepted. But what if the user entered "xbaseballx"? In this case, that first test will not detect the "hidden" word; but the second test will. The second test would check each word in the wordList and see if it is contained in the user entry. Since "xbaseballx" does contain "baseball", the entry would not be accepted.
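To make the difference between the two tests concrete, here is a minimal sketch in Python rather than Lingo. The function names are hypothetical (they are not from the demo), and the case-insensitive comparison is an assumption made for this sketch.

```python
# Hypothetical helper names; the list is the article's sports example.
WORD_LIST = ["baseball", "soccer", "tennis", "badminton", "polo"]

def entry_word_in_list(entry, word_list):
    """Test 1: is any whole word of the entry on the list?"""
    banned = {w.lower() for w in word_list}
    return any(word.lower() in banned for word in entry.split())

def list_word_in_entry(entry, word_list):
    """Test 2: does any listed word appear anywhere inside the entry?"""
    lowered = entry.lower()
    return any(w.lower() in lowered for w in word_list)

def is_acceptable(entry, word_list=WORD_LIST):
    """Accept the entry only if both tests find nothing."""
    return not (entry_word_in_list(entry, word_list)
                or list_word_in_entry(entry, word_list))

print(entry_word_in_list("xbaseballx", WORD_LIST))  # False
print(list_word_in_entry("xbaseballx", WORD_LIST))  # True
print(is_acceptable("Pat"))                         # True
```

In this sketch the second test also catches anything the first one does; both are kept only to mirror the two steps described above.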

Now we'll look at the Lingo involved. The first thing I do is convert the field containing the words into a list (Lingo not shown below, but available in the demo). I'm only showing a couple of the handlers -- the ones specifically handling the language check. When the user hits the "finished" button, that calls the finishMethod handler, which in turn calls the checkLanguage handler.

on finishMethod me
  
  myData = pMem.text
  
  if the last char of myData = "_" then
    delete the last char of myData
  end if
  
  if voidP(myData) or myData = "" then
    alert "Please enter your name by selecting the letters."
  else
    
    myData = checkLanguage(me, myData)
    cursor 0
    
    if myData <> "" then
      go to pNextMarker
    end if
    
  end if
  
end finishMethod


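on checkLanguage me, myData
  
  -- show the watch cursor while the check runs
  cursor 4
  
  -- Test 1: is any word of the entry on the prohibited list?
  set wordCount = the number of words in myData
  
  repeat with i = 1 to wordCount
    
    set testWord = word i of myData
    if getPos(pWordList, testWord) > 0 then
      languageAlert me
      cursor 0
      -- stop at the first match and reject the entry
      return ""
    end if
    
  end repeat
  
  -- Test 2: is any prohibited word hidden inside the entry?
  repeat with testWord in pWordList
    
    if myData contains testWord then
      languageAlert me
      cursor 0
      return ""
    end if
    
  end repeat
  
  -- no match found: hand the entry back unchanged
  return myData
  
end checkLanguage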
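
The handlers above rely on pWordList, which is built from the field of prohibited words. That conversion lives in the demo and isn't reproduced here. As a rough idea of what it involves, here is a Python sketch (not the demo's Lingo), assuming the field holds one word per line. The function name is hypothetical.

```python
def field_to_word_list(field_text):
    """Turn the text of a word field into a list, skipping blank lines."""
    words = []
    for line in field_text.splitlines():
        word = line.strip()  # drop stray spaces around the word
        if word:
            words.append(word)
    return words

# Hypothetical field contents using the article's sports words.
field_text = "baseball\nsoccer\n\ntennis \nbadminton\npolo\n"
print(field_to_word_list(field_text))
# ['baseball', 'soccer', 'tennis', 'badminton', 'polo']
```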

You'll notice that I change the cursor to cursor 4 (the watch/hourglass) while the check is going on. I do this because if you have a long list of words, it might take a few seconds to process and you don't want the user confused by the pause.

The toughest part of this exercise is coming up with a list of inappropriate words. Thinking of the words isn't that tough, but finding words that are always unacceptable is. For example, in the project I recently completed, our list originally included "Klu", "Klux", and "Klan". You'd think those would be safe to include on the list. But when we asked the user to enter their hometown, we discovered that it wouldn't accept "Oakland" or "Auckland" or any other word that contained "klan", because the second test looks for each listed word anywhere in the entry. You can see that this makes it tough. For similar reasons, you can't include obvious words like "shit". If you download the demo, there's a field which contains the prohibited words that we ended up using.
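
The collision described above is easy to reproduce. The following Python sketch (again, not the demo's Lingo) reports which listed words are hidden inside an entry. The function name and the case-insensitive match are assumptions for illustration.

```python
def hidden_matches(entry, word_list):
    """Return the listed words found anywhere inside the entry."""
    lowered = entry.lower()
    return [w for w in word_list if w.lower() in lowered]

word_list = ["klux", "klan"]  # lowercase forms of two words named above

for town in ["Oakland", "Auckland", "Boston"]:
    print(town, hidden_matches(town, word_list))
# Oakland ['klan']
# Auckland ['klan']
# Boston []
```

Running candidate words against a sample of names and places you expect to accept, before adding them to the list, is one way to spot this kind of collision early.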

In the end, it's safe to assume that if anyone really wants to enter something offensive, they'll find a way. You just can't check for every possible variation. Good luck with your program.

Patrick McClellan is Director Online's co-founder. Pat is Vice President, Managing Director for Jack Morton Worldwide, a global experiential marketing company. He is responsible for the San Francisco office, which helps major technology clients to develop marketing communications programs to reach enterprise and consumer audiences.

Copyright 1997-2024, Director Online. Article content copyright by respective authors.